Overview

Dataset statistics

Number of variables: 93
Number of observations: 604720
Missing cells: 35800281
Missing cells (%): 63.7%
Total size in memory: 429.1 MiB
Average record size in memory: 744.0 B
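
The derived figures in this block follow from the raw counts. The sketch below recomputes the missing-cell percentage and the average record size from the values printed above; the variable names are illustrative, and because the total size is itself rounded to 0.1 MiB, the recomputed record size only matches the reported 744.0 B approximately.

```python
# Figures copied from the "Dataset statistics" block of this report.
n_variables = 93
n_observations = 604_720
missing_cells = 35_800_281
total_size_mib = 429.1  # rounded in the report

# Every variable contributes one cell per observation.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# 1 MiB = 1024 * 1024 bytes.
avg_record_bytes = total_size_mib * 1024 * 1024 / n_observations

print(f"Total cells: {total_cells}")
print(f"Missing cells (%): {missing_pct:.1f}%")
print(f"Average record size (B): {avg_record_bytes:.2f}")
```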

Variable types

Text: 93

Dataset

Description: Entomology NMNH Extant Extant Specimen Records 0052484-241126133413365
URL: https://doi.org/10.15468/dl.ptewed

Alerts

institutionID has constant value "urn:lsid:biocol.org:col:34871" Constant
collectionID has constant value "urn:uuid:18e3cd08-a962-4f0a-b72c-9a0b3600c5ad" Constant
institutionCode has constant value "USNM" Constant
collectionCode has constant value "ENT" Constant
datasetName has constant value "NMNH Extant Biology" Constant
organismID has constant value "70 21'9"W" Constant
eventType has constant value "-11.7815" Constant
waterBody has constant value "DeMarmels" Constant
verbatimDepth has constant value "220m inside cave entrance" Constant
locationRemarks has constant value "Garrison, Rosser W." Constant
verbatimSRS has constant value "Argia" Constant
footprintSpatialFit has constant value "Gynacantha membranalis" Constant
georeferencedBy has constant value "orichalcea" Constant
earliestEonOrLowestEonothem has constant value "Animalia, Arthropoda, Insecta, Odonata, Anisoptera, Aeshnidae" Constant
latestEonOrHighestEonothem has constant value "Animalia" Constant
earliestEraOrLowestErathem has constant value "Arthropoda" Constant
latestEraOrHighestErathem has constant value "Insecta" Constant
latestEpochOrHighestSeries has constant value "Pinellas" Constant
lowestBiostratigraphicZone has constant value "Gynacantha" Constant
formation has constant value "membranalis" Constant
identificationReferences has constant value "WGS 84 (EPSG:4326)" Constant
originalNameUsage has constant value "Google Earth" Constant
kingdom has constant value "Animalia" Constant
vernacularName has constant value "Type" Constant
catalogNumber has 233452 (38.6%) missing values Missing
recordNumber has 604683 (> 99.9%) missing values Missing
recordedBy has 203369 (33.6%) missing values Missing
sex has 339511 (56.1%) missing values Missing
lifeStage has 174155 (28.8%) missing values Missing
preparations has 42056 (7.0%) missing values Missing
associatedMedia has 390092 (64.5%) missing values Missing
occurrenceRemarks has 459346 (76.0%) missing values Missing
organismID has 604719 (> 99.9%) missing values Missing
eventType has 604719 (> 99.9%) missing values Missing
fieldNumber has 600468 (99.3%) missing values Missing
eventDate has 239420 (39.6%) missing values Missing
startDayOfYear has 244789 (40.5%) missing values Missing
endDayOfYear has 244303 (40.4%) missing values Missing
year has 239420 (39.6%) missing values Missing
month has 246636 (40.8%) missing values Missing
day has 270887 (44.8%) missing values Missing
verbatimEventDate has 396366 (65.5%) missing values Missing
habitat has 604521 (> 99.9%) missing values Missing
locationID has 603675 (99.8%) missing values Missing
higherGeography has 156093 (25.8%) missing values Missing
continent has 604592 (> 99.9%) missing values Missing
waterBody has 604719 (> 99.9%) missing values Missing
islandGroup has 602200 (99.6%) missing values Missing
island has 595353 (98.5%) missing values Missing
country has 156115 (25.8%) missing values Missing
stateProvince has 173239 (28.6%) missing values Missing
county has 254867 (42.1%) missing values Missing
locality has 158363 (26.2%) missing values Missing
minimumElevationInMeters has 558058 (92.3%) missing values Missing
maximumElevationInMeters has 573266 (94.8%) missing values Missing
verbatimElevation has 594785 (98.4%) missing values Missing
minimumDepthInMeters has 604685 (> 99.9%) missing values Missing
maximumDepthInMeters has 604709 (> 99.9%) missing values Missing
verbatimDepth has 604714 (> 99.9%) missing values Missing
locationRemarks has 604719 (> 99.9%) missing values Missing
decimalLatitude has 285696 (47.2%) missing values Missing
decimalLongitude has 285696 (47.2%) missing values Missing
geodeticDatum has 578337 (95.6%) missing values Missing
coordinateUncertaintyInMeters has 592766 (98.0%) missing values Missing
coordinatePrecision has 604717 (> 99.9%) missing values Missing
pointRadiusSpatialFit has 604718 (> 99.9%) missing values Missing
verbatimCoordinates has 604718 (> 99.9%) missing values Missing
verbatimLatitude has 523062 (86.5%) missing values Missing
verbatimLongitude has 523032 (86.5%) missing values Missing
verbatimCoordinateSystem has 604717 (> 99.9%) missing values Missing
verbatimSRS has 604719 (> 99.9%) missing values Missing
footprintSpatialFit has 604719 (> 99.9%) missing values Missing
georeferencedBy has 604719 (> 99.9%) missing values Missing
georeferenceProtocol has 366819 (60.7%) missing values Missing
georeferenceRemarks has 596270 (98.6%) missing values Missing
geologicalContextID has 604716 (> 99.9%) missing values Missing
earliestEonOrLowestEonothem has 604719 (> 99.9%) missing values Missing
latestEonOrHighestEonothem has 604719 (> 99.9%) missing values Missing
earliestEraOrLowestErathem has 604719 (> 99.9%) missing values Missing
latestEraOrHighestErathem has 604719 (> 99.9%) missing values Missing
earliestPeriodOrLowestSystem has 604716 (> 99.9%) missing values Missing
earliestEpochOrLowestSeries has 604717 (> 99.9%) missing values Missing
latestEpochOrHighestSeries has 604719 (> 99.9%) missing values Missing
latestAgeOrHighestStage has 604717 (> 99.9%) missing values Missing
lowestBiostratigraphicZone has 604719 (> 99.9%) missing values Missing
formation has 604719 (> 99.9%) missing values Missing
identificationQualifier has 603282 (99.8%) missing values Missing
typeStatus has 486142 (80.4%) missing values Missing
identifiedBy has 455024 (75.2%) missing values Missing
identifiedByID has 604718 (> 99.9%) missing values Missing
dateIdentified has 604718 (> 99.9%) missing values Missing
identificationReferences has 604719 (> 99.9%) missing values Missing
originalNameUsage has 604719 (> 99.9%) missing values Missing
kingdom has 6300 (1.0%) missing values Missing
subgenus has 512525 (84.8%) missing values Missing
specificEpithet has 8751 (1.4%) missing values Missing
infraspecificEpithet has 571231 (94.5%) missing values Missing
taxonRank has 571236 (94.5%) missing values Missing
scientificNameAuthorship has 90502 (15.0%) missing values Missing
vernacularName has 604718 (> 99.9%) missing values Missing
gbifID has unique values Unique
occurrenceID has unique values Unique
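
The three alert types listed above (Constant, Missing, Unique) can be approximated with a few lines of standard-library Python. This is a minimal sketch, not ydata-profiling's implementation: the function name `column_alerts`, the treatment of empty strings as missing, and the absence of any threshold are all assumptions, and the three-row sample is hypothetical rather than taken from the dataset.

```python
from collections import Counter


def column_alerts(rows, columns):
    """Return {column: [alert, ...]} for a list of dict rows.

    Assumption: None and empty strings both count as missing.
    """
    alerts = {}
    n_rows = len(rows)
    for col in columns:
        present = [r.get(col) for r in rows if r.get(col) not in (None, "")]
        counts = Counter(present)
        found = []
        if len(counts) == 1:
            found.append("Constant")  # a single distinct non-missing value
        if len(present) < n_rows:
            found.append("Missing")   # at least one missing cell
        if present and len(present) == n_rows and len(counts) == n_rows:
            found.append("Unique")    # every row has its own value
        alerts[col] = found
    return alerts


# Hypothetical sample rows, not taken from the dataset.
rows = [
    {"gbifID": "1", "institutionCode": "USNM", "sex": "male", "waterBody": ""},
    {"gbifID": "2", "institutionCode": "USNM", "sex": "", "waterBody": "X"},
    {"gbifID": "3", "institutionCode": "USNM", "sex": "female", "waterBody": ""},
]
result = column_alerts(rows, ["gbifID", "institutionCode", "sex", "waterBody"])
for name, found in result.items():
    print(name, found)
```

A column such as waterBody can carry both a Constant and a Missing alert, as it does in the list above: almost every cell is missing and the few that remain share one value.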

Reproduction

Analysis started: 2025-01-14 16:39:48.938174
Analysis finished: 2025-01-14 16:40:13.576344
Duration: 24.64 seconds
Software version: ydata-profiling v4.12.1
Download configuration: config.json
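
The reported duration is the difference between the two timestamps. A short check with the standard library, assuming both timestamps are in the same time zone:

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S.%f"

# Timestamps copied from the "Reproduction" block.
started = datetime.strptime("2025-01-14 16:39:48.938174", FMT)
finished = datetime.strptime("2025-01-14 16:40:13.576344", FMT)

duration = (finished - started).total_seconds()
print(f"Duration: {duration:.2f} seconds")
```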

Variables

gbifID
Text

Unique 

Distinct: 604720
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 4.6 MiB

Length

Max length: 10
Median length: 10
Mean length: 10
Min length: 10

Characters and Unicode

Total characters: 6047200
Distinct characters: 10
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
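
As an illustration of the category counts reported for each variable, the sketch below tallies Unicode general categories with Python's standard `unicodedata` module. The helper `category_counts` and the name mapping are illustrative assumptions; the standard library does not expose script or block properties, so only categories are covered here.

```python
import unicodedata
from collections import Counter

# Two-letter general-category codes mapped to the names used in this report.
CATEGORY_NAMES = {
    "Nd": "Decimal Number",
    "Lu": "Uppercase Letter",
    "Ll": "Lowercase Letter",
    "Zs": "Space Separator",
    "Pd": "Dash Punctuation",
    "Po": "Other Punctuation",
    "Ps": "Open Punctuation",
    "Pe": "Close Punctuation",
}


def category_counts(values):
    """Count characters per Unicode general category across all strings."""
    counts = Counter()
    for value in values:
        for ch in value:
            code = unicodedata.category(ch)
            # Fall back to the raw code for categories not named above.
            counts[CATEGORY_NAMES.get(code, code)] += 1
    return counts


# The five gbifID sample rows shown in this report.
sample = ["1321729650", "1320180785", "4403931423", "1320185860", "1320185980"]
counts = category_counts(sample)
print(dict(counts))
```

Applied to a single datasetName value, `category_counts(["NMNH Extant Biology"])` gives 11 lowercase letters, 6 uppercase letters and 2 spaces, which matches the per-record ratios in that variable's category table.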

Unique

Unique: 604720
Unique (%): 100.0%

Sample

1st row: 1321729650
2nd row: 1320180785
3rd row: 4403931423
4th row: 1320185860
5th row: 1320185980

Value | Count | Frequency (%)
1321729650 | 1 | < 0.1%
1321751610 | 1 | < 0.1%
1828939237 | 1 | < 0.1%
1321753851 | 1 | < 0.1%
4403917418 | 1 | < 0.1%
1321742115 | 1 | < 0.1%
4403931423 | 1 | < 0.1%
1320185860 | 1 | < 0.1%
1320185980 | 1 | < 0.1%
2236094411 | 1 | < 0.1%
Other values (604710) | 604710 | > 99.9%

Most occurring characters

Value | Count | Frequency (%)
1 | 1132979 | 18.7%
3 | 860538 | 14.2%
2 | 781913 | 12.9%
0 | 530707 | 8.8%
8 | 513756 | 8.5%
9 | 488229 | 8.1%
7 | 474017 | 7.8%
4 | 451821 | 7.5%
5 | 410705 | 6.8%
6 | 402535 | 6.7%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 6047200 | 100.0%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
1 | 1132979 | 18.7%
3 | 860538 | 14.2%
2 | 781913 | 12.9%
0 | 530707 | 8.8%
8 | 513756 | 8.5%
9 | 488229 | 8.1%
7 | 474017 | 7.8%
4 | 451821 | 7.5%
5 | 410705 | 6.8%
6 | 402535 | 6.7%

Most occurring scripts

Value | Count | Frequency (%)
Common | 6047200 | 100.0%

Most frequent character per script

Common
Value | Count | Frequency (%)
1 | 1132979 | 18.7%
3 | 860538 | 14.2%
2 | 781913 | 12.9%
0 | 530707 | 8.8%
8 | 513756 | 8.5%
9 | 488229 | 8.1%
7 | 474017 | 7.8%
4 | 451821 | 7.5%
5 | 410705 | 6.8%
6 | 402535 | 6.7%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 6047200 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
1 | 1132979 | 18.7%
3 | 860538 | 14.2%
2 | 781913 | 12.9%
0 | 530707 | 8.8%
8 | 513756 | 8.5%
9 | 488229 | 8.1%
7 | 474017 | 7.8%
4 | 451821 | 7.5%
5 | 410705 | 6.8%
6 | 402535 | 6.7%

modified
Text

Distinct: 56593
Distinct (%): 9.4%
Missing: 0
Missing (%): 0.0%
Memory size: 4.6 MiB

Length

Max length: 19
Median length: 19
Mean length: 19
Min length: 19

Characters and Unicode

Total characters: 11489680
Distinct characters: 13
Distinct categories: 4
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 30780
Unique (%): 5.1%

Sample

1st row: 2013-09-16 11:56:00
2nd row: 2016-06-09 14:33:00
3rd row: 2023-08-23 09:36:00
4th row: 2023-05-19 10:32:00
5th row: 2015-10-05 15:58:00

Value | Count | Frequency (%)
2023-05-13 | 60773 | 5.0%
2017-04-17 | 42518 | 3.5%
2014-01-09 | 31212 | 2.6%
2023-05-15 | 20528 | 1.7%
2023-05-12 | 16800 | 1.4%
2015-10-06 | 15979 | 1.3%
2018-02-08 | 14193 | 1.2%
2015-10-05 | 10265 | 0.8%
2017-09-29 | 10242 | 0.8%
11:48:00 | 10115 | 0.8%
Other values (3141) | 976815 | 80.8%

Most occurring characters

Value | Count | Frequency (%)
0 | 2928318 | 25.5%
1 | 1566388 | 13.6%
2 | 1372552 | 11.9%
- | 1209440 | 10.5%
: | 1209440 | 10.5%
(space) | 604720 | 5.3%
3 | 593130 | 5.2%
5 | 494673 | 4.3%
4 | 456568 | 4.0%
9 | 314335 | 2.7%
Other values (3) | 740116 | 6.4%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 8466080 | 73.7%
Dash Punctuation | 1209440 | 10.5%
Other Punctuation | 1209440 | 10.5%
Space Separator | 604720 | 5.3%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
0 | 2928318 | 34.6%
1 | 1566388 | 18.5%
2 | 1372552 | 16.2%
3 | 593130 | 7.0%
5 | 494673 | 5.8%
4 | 456568 | 5.4%
9 | 314335 | 3.7%
7 | 310958 | 3.7%
6 | 238204 | 2.8%
8 | 190954 | 2.3%

Dash Punctuation
Value | Count | Frequency (%)
- | 1209440 | 100.0%

Other Punctuation
Value | Count | Frequency (%)
: | 1209440 | 100.0%

Space Separator
Value | Count | Frequency (%)
(space) | 604720 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 11489680 | 100.0%

Most frequent character per script

Common
Value | Count | Frequency (%)
0 | 2928318 | 25.5%
1 | 1566388 | 13.6%
2 | 1372552 | 11.9%
- | 1209440 | 10.5%
: | 1209440 | 10.5%
(space) | 604720 | 5.3%
3 | 593130 | 5.2%
5 | 494673 | 4.3%
4 | 456568 | 4.0%
9 | 314335 | 2.7%
Other values (3) | 740116 | 6.4%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 11489680 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
0 | 2928318 | 25.5%
1 | 1566388 | 13.6%
2 | 1372552 | 11.9%
- | 1209440 | 10.5%
: | 1209440 | 10.5%
(space) | 604720 | 5.3%
3 | 593130 | 5.2%
5 | 494673 | 4.3%
4 | 456568 | 4.0%
9 | 314335 | 2.7%
Other values (3) | 740116 | 6.4%

institutionID
Text

Constant 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 4.6 MiB

Length

Max length: 29
Median length: 29
Mean length: 29
Min length: 29

Characters and Unicode

Total characters: 17536880
Distinct characters: 18
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: urn:lsid:biocol.org:col:34871
2nd row: urn:lsid:biocol.org:col:34871
3rd row: urn:lsid:biocol.org:col:34871
4th row: urn:lsid:biocol.org:col:34871
5th row: urn:lsid:biocol.org:col:34871

Value | Count | Frequency (%)
urn:lsid:biocol.org:col:34871 | 604720 | 100.0%

Most occurring characters

Value | Count | Frequency (%)
o | 2418880 | 13.8%
: | 2418880 | 13.8%
l | 1814160 | 10.3%
i | 1209440 | 6.9%
r | 1209440 | 6.9%
c | 1209440 | 6.9%
g | 604720 | 3.4%
7 | 604720 | 3.4%
8 | 604720 | 3.4%
4 | 604720 | 3.4%
Other values (8) | 4837760 | 27.6%

Most occurring categories

Value | Count | Frequency (%)
Lowercase Letter | 11489680 | 65.5%
Other Punctuation | 3023600 | 17.2%
Decimal Number | 3023600 | 17.2%

Most frequent character per category

Lowercase Letter
Value | Count | Frequency (%)
o | 2418880 | 21.1%
l | 1814160 | 15.8%
i | 1209440 | 10.5%
r | 1209440 | 10.5%
c | 1209440 | 10.5%
g | 604720 | 5.3%
u | 604720 | 5.3%
b | 604720 | 5.3%
d | 604720 | 5.3%
s | 604720 | 5.3%

Decimal Number
Value | Count | Frequency (%)
7 | 604720 | 20.0%
8 | 604720 | 20.0%
4 | 604720 | 20.0%
3 | 604720 | 20.0%
1 | 604720 | 20.0%

Other Punctuation
Value | Count | Frequency (%)
: | 2418880 | 80.0%
. | 604720 | 20.0%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 11489680 | 65.5%
Common | 6047200 | 34.5%

Most frequent character per script

Latin
Value | Count | Frequency (%)
o | 2418880 | 21.1%
l | 1814160 | 15.8%
i | 1209440 | 10.5%
r | 1209440 | 10.5%
c | 1209440 | 10.5%
g | 604720 | 5.3%
u | 604720 | 5.3%
b | 604720 | 5.3%
d | 604720 | 5.3%
s | 604720 | 5.3%

Common
Value | Count | Frequency (%)
: | 2418880 | 40.0%
7 | 604720 | 10.0%
8 | 604720 | 10.0%
4 | 604720 | 10.0%
3 | 604720 | 10.0%
. | 604720 | 10.0%
1 | 604720 | 10.0%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 17536880 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
o | 2418880 | 13.8%
: | 2418880 | 13.8%
l | 1814160 | 10.3%
i | 1209440 | 6.9%
r | 1209440 | 6.9%
c | 1209440 | 6.9%
g | 604720 | 3.4%
7 | 604720 | 3.4%
8 | 604720 | 3.4%
4 | 604720 | 3.4%
Other values (8) | 4837760 | 27.6%

collectionID
Text

Constant 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 4.6 MiB

Length

Max length: 45
Median length: 45
Mean length: 45
Min length: 45

Characters and Unicode

Total characters: 27212400
Distinct characters: 22
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: urn:uuid:18e3cd08-a962-4f0a-b72c-9a0b3600c5ad
2nd row: urn:uuid:18e3cd08-a962-4f0a-b72c-9a0b3600c5ad
3rd row: urn:uuid:18e3cd08-a962-4f0a-b72c-9a0b3600c5ad
4th row: urn:uuid:18e3cd08-a962-4f0a-b72c-9a0b3600c5ad
5th row: urn:uuid:18e3cd08-a962-4f0a-b72c-9a0b3600c5ad

Value | Count | Frequency (%)
urn:uuid:18e3cd08-a962-4f0a-b72c-9a0b3600c5ad | 604720 | 100.0%

Most occurring characters

Value | Count | Frequency (%)
0 | 3023600 | 11.1%
a | 2418880 | 8.9%
- | 2418880 | 8.9%
d | 1814160 | 6.7%
c | 1814160 | 6.7%
u | 1814160 | 6.7%
8 | 1209440 | 4.4%
3 | 1209440 | 4.4%
: | 1209440 | 4.4%
9 | 1209440 | 4.4%
Other values (12) | 9070800 | 33.3%

Most occurring categories

Value | Count | Frequency (%)
Lowercase Letter | 12094400 | 44.4%
Decimal Number | 11489680 | 42.2%
Dash Punctuation | 2418880 | 8.9%
Other Punctuation | 1209440 | 4.4%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
0 | 3023600 | 26.3%
8 | 1209440 | 10.5%
3 | 1209440 | 10.5%
9 | 1209440 | 10.5%
6 | 1209440 | 10.5%
2 | 1209440 | 10.5%
1 | 604720 | 5.3%
4 | 604720 | 5.3%
7 | 604720 | 5.3%
5 | 604720 | 5.3%

Lowercase Letter
Value | Count | Frequency (%)
a | 2418880 | 20.0%
d | 1814160 | 15.0%
c | 1814160 | 15.0%
u | 1814160 | 15.0%
b | 1209440 | 10.0%
e | 604720 | 5.0%
i | 604720 | 5.0%
r | 604720 | 5.0%
n | 604720 | 5.0%
f | 604720 | 5.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 2418880 | 100.0%

Other Punctuation
Value | Count | Frequency (%)
: | 1209440 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 15118000 | 55.6%
Latin | 12094400 | 44.4%

Most frequent character per script

Common
Value | Count | Frequency (%)
0 | 3023600 | 20.0%
- | 2418880 | 16.0%
8 | 1209440 | 8.0%
3 | 1209440 | 8.0%
: | 1209440 | 8.0%
9 | 1209440 | 8.0%
6 | 1209440 | 8.0%
2 | 1209440 | 8.0%
1 | 604720 | 4.0%
4 | 604720 | 4.0%
Other values (2) | 1209440 | 8.0%

Latin
Value | Count | Frequency (%)
a | 2418880 | 20.0%
d | 1814160 | 15.0%
c | 1814160 | 15.0%
u | 1814160 | 15.0%
b | 1209440 | 10.0%
e | 604720 | 5.0%
i | 604720 | 5.0%
r | 604720 | 5.0%
n | 604720 | 5.0%
f | 604720 | 5.0%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 27212400 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
0 | 3023600 | 11.1%
a | 2418880 | 8.9%
- | 2418880 | 8.9%
d | 1814160 | 6.7%
c | 1814160 | 6.7%
u | 1814160 | 6.7%
8 | 1209440 | 4.4%
3 | 1209440 | 4.4%
: | 1209440 | 4.4%
9 | 1209440 | 4.4%
Other values (12) | 9070800 | 33.3%

institutionCode
Text

Constant 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 4.6 MiB

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Characters and Unicode

Total characters: 2418880
Distinct characters: 4
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: USNM
2nd row: USNM
3rd row: USNM
4th row: USNM
5th row: USNM

Value | Count | Frequency (%)
usnm | 604720 | 100.0%

Most occurring characters

Value | Count | Frequency (%)
U | 604720 | 25.0%
S | 604720 | 25.0%
N | 604720 | 25.0%
M | 604720 | 25.0%

Most occurring categories

Value | Count | Frequency (%)
Uppercase Letter | 2418880 | 100.0%

Most frequent character per category

Uppercase Letter
Value | Count | Frequency (%)
U | 604720 | 25.0%
S | 604720 | 25.0%
N | 604720 | 25.0%
M | 604720 | 25.0%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 2418880 | 100.0%

Most frequent character per script

Latin
Value | Count | Frequency (%)
U | 604720 | 25.0%
S | 604720 | 25.0%
N | 604720 | 25.0%
M | 604720 | 25.0%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 2418880 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
U | 604720 | 25.0%
S | 604720 | 25.0%
N | 604720 | 25.0%
M | 604720 | 25.0%

collectionCode
Text

Constant 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 4.6 MiB

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Characters and Unicode

Total characters: 1814160
Distinct characters: 3
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: ENT
2nd row: ENT
3rd row: ENT
4th row: ENT
5th row: ENT

Value | Count | Frequency (%)
ent | 604720 | 100.0%

Most occurring characters

Value | Count | Frequency (%)
E | 604720 | 33.3%
N | 604720 | 33.3%
T | 604720 | 33.3%

Most occurring categories

Value | Count | Frequency (%)
Uppercase Letter | 1814160 | 100.0%

Most frequent character per category

Uppercase Letter
Value | Count | Frequency (%)
E | 604720 | 33.3%
N | 604720 | 33.3%
T | 604720 | 33.3%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 1814160 | 100.0%

Most frequent character per script

Latin
Value | Count | Frequency (%)
E | 604720 | 33.3%
N | 604720 | 33.3%
T | 604720 | 33.3%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 1814160 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
E | 604720 | 33.3%
N | 604720 | 33.3%
T | 604720 | 33.3%

datasetName
Text

Constant 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 4.6 MiB

Length

Max length: 19
Median length: 19
Mean length: 19
Min length: 19

Characters and Unicode

Total characters: 11489680
Distinct characters: 15
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: NMNH Extant Biology
2nd row: NMNH Extant Biology
3rd row: NMNH Extant Biology
4th row: NMNH Extant Biology
5th row: NMNH Extant Biology

Value | Count | Frequency (%)
nmnh | 604720 | 33.3%
extant | 604720 | 33.3%
biology | 604720 | 33.3%

Most occurring characters

Value | Count | Frequency (%)
N | 1209440 | 10.5%
(space) | 1209440 | 10.5%
t | 1209440 | 10.5%
o | 1209440 | 10.5%
M | 604720 | 5.3%
H | 604720 | 5.3%
E | 604720 | 5.3%
x | 604720 | 5.3%
a | 604720 | 5.3%
n | 604720 | 5.3%
Other values (5) | 3023600 | 26.3%

Most occurring categories

Value | Count | Frequency (%)
Lowercase Letter | 6651920 | 57.9%
Uppercase Letter | 3628320 | 31.6%
Space Separator | 1209440 | 10.5%

Most frequent character per category

Lowercase Letter
Value | Count | Frequency (%)
t | 1209440 | 18.2%
o | 1209440 | 18.2%
x | 604720 | 9.1%
a | 604720 | 9.1%
n | 604720 | 9.1%
i | 604720 | 9.1%
l | 604720 | 9.1%
g | 604720 | 9.1%
y | 604720 | 9.1%

Uppercase Letter
Value | Count | Frequency (%)
N | 1209440 | 33.3%
M | 604720 | 16.7%
H | 604720 | 16.7%
E | 604720 | 16.7%
B | 604720 | 16.7%

Space Separator
Value | Count | Frequency (%)
(space) | 1209440 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 10280240 | 89.5%
Common | 1209440 | 10.5%

Most frequent character per script

Latin
Value | Count | Frequency (%)
N | 1209440 | 11.8%
t | 1209440 | 11.8%
o | 1209440 | 11.8%
M | 604720 | 5.9%
H | 604720 | 5.9%
E | 604720 | 5.9%
x | 604720 | 5.9%
a | 604720 | 5.9%
n | 604720 | 5.9%
B | 604720 | 5.9%
Other values (4) | 2418880 | 23.5%

Common
Value | Count | Frequency (%)
(space) | 1209440 | 100.0%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 11489680 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
N | 1209440 | 10.5%
(space) | 1209440 | 10.5%
t | 1209440 | 10.5%
o | 1209440 | 10.5%
M | 604720 | 5.3%
H | 604720 | 5.3%
E | 604720 | 5.3%
x | 604720 | 5.3%
a | 604720 | 5.3%
n | 604720 | 5.3%
Other values (5) | 3023600 | 26.3%

basisOfRecord
Text

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 4.6 MiB

Length

Max length: 17
Median length: 17
Mean length: 16.99375083
Min length: 16

Characters and Unicode

Total characters: 10276461
Distinct characters: 19
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: PreservedSpecimen
2nd row: PreservedSpecimen
3rd row: PreservedSpecimen
4th row: PreservedSpecimen
5th row: PreservedSpecimen

Value | Count | Frequency (%)
preservedspecimen | 600941 | 99.4%
humanobservation | 3779 | 0.6%

Most occurring characters

Value | Count | Frequency (%)
e | 3008484 | 29.3%
r | 1205661 | 11.7%
n | 608499 | 5.9%
i | 604720 | 5.9%
s | 604720 | 5.9%
v | 604720 | 5.9%
m | 604720 | 5.9%
c | 600941 | 5.8%
P | 600941 | 5.8%
p | 600941 | 5.8%
Other values (9) | 1232114 | 12.0%

Most occurring categories

Value | Count | Frequency (%)
Lowercase Letter | 9067021 | 88.2%
Uppercase Letter | 1209440 | 11.8%

Most frequent character per category

Lowercase Letter
Value | Count | Frequency (%)
e | 3008484 | 33.2%
r | 1205661 | 13.3%
n | 608499 | 6.7%
i | 604720 | 6.7%
s | 604720 | 6.7%
v | 604720 | 6.7%
m | 604720 | 6.7%
c | 600941 | 6.6%
p | 600941 | 6.6%
d | 600941 | 6.6%
Other values (5) | 22674 | 0.3%

Uppercase Letter
Value | Count | Frequency (%)
P | 600941 | 49.7%
S | 600941 | 49.7%
H | 3779 | 0.3%
O | 3779 | 0.3%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 10276461 | 100.0%

Most frequent character per script

Latin
Value | Count | Frequency (%)
e | 3008484 | 29.3%
r | 1205661 | 11.7%
n | 608499 | 5.9%
i | 604720 | 5.9%
s | 604720 | 5.9%
v | 604720 | 5.9%
m | 604720 | 5.9%
c | 600941 | 5.8%
P | 600941 | 5.8%
p | 600941 | 5.8%
Other values (9) | 1232114 | 12.0%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 10276461 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
e | 3008484 | 29.3%
r | 1205661 | 11.7%
n | 608499 | 5.9%
i | 604720 | 5.9%
s | 604720 | 5.9%
v | 604720 | 5.9%
m | 604720 | 5.9%
c | 600941 | 5.8%
P | 600941 | 5.8%
p | 600941 | 5.8%
Other values (9) | 1232114 | 12.0%

occurrenceID
Text

Unique 

Distinct: 604720
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 4.6 MiB

Length

Max length: 63
Median length: 63
Mean length: 63
Min length: 63

Characters and Unicode

Total characters: 38097360
Distinct characters: 26
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1

Unique

Unique: 604720
Unique (%): 100.0%

Sample

1st row: http://n2t.net/ark:/65665/3c83a10d1-1e59-4b08-af5b-28d12d2d0c80
2nd row: http://n2t.net/ark:/65665/383bb510d-d5ce-4c09-b4c4-bc1482fbaf28
3rd row: http://n2t.net/ark:/65665/383f13aa6-a5b6-40bc-bddc-b42c557aebfc
4th row: http://n2t.net/ark:/65665/383f4d560-c2d2-485c-906c-b6dad303fd7a
5th row: http://n2t.net/ark:/65665/383f634da-bb58-423c-85f4-a267b04c5ee5

Value | Count | Frequency (%)
http://n2t.net/ark:/65665/3c83a10d1-1e59-4b08-af5b-28d12d2d0c80 | 1 | < 0.1%
http://n2t.net/ark:/65665/3c932a059-56b2-4846-9e97-741d7bdde29c | 1 | < 0.1%
http://n2t.net/ark:/65665/384cb9f0c-76d8-41b2-9a2e-351c10a4ab3f | 1 | < 0.1%
http://n2t.net/ark:/65665/3c94d744a-d127-4564-9b0c-5d349a138dd0 | 1 | < 0.1%
http://n2t.net/ark:/65665/384c3715b-7768-468a-b76b-a68ff7a554d0 | 1 | < 0.1%
http://n2t.net/ark:/65665/3c8c6462b-a9e9-4efa-9205-6fb4e5ef4e65 | 1 | < 0.1%
http://n2t.net/ark:/65665/383f13aa6-a5b6-40bc-bddc-b42c557aebfc | 1 | < 0.1%
http://n2t.net/ark:/65665/383f4d560-c2d2-485c-906c-b6dad303fd7a | 1 | < 0.1%
http://n2t.net/ark:/65665/383f634da-bb58-423c-85f4-a267b04c5ee5 | 1 | < 0.1%
http://n2t.net/ark:/65665/3c898aee2-d463-49d7-ad9c-6fd423e170e1 | 1 | < 0.1%
Other values (604710) | 604710 | > 99.9%

Most occurring characters

Value | Count | Frequency (%)
/ | 3023600 | 7.9%
6 | 2949751 | 7.7%
- | 2418880 | 6.3%
t | 2418880 | 6.3%
5 | 2343491 | 6.2%
a | 1889528 | 5.0%
2 | 1739197 | 4.6%
e | 1738583 | 4.6%
3 | 1737642 | 4.6%
4 | 1737535 | 4.6%
Other values (16) | 16100273 | 42.3%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 16480712 | 43.3%
Lowercase Letter | 14360008 | 37.7%
Other Punctuation | 4837760 | 12.7%
Dash Punctuation | 2418880 | 6.3%

Most frequent character per category

Lowercase Letter
Value | Count | Frequency (%)
t | 2418880 | 16.8%
a | 1889528 | 13.2%
e | 1738583 | 12.1%
b | 1286170 | 9.0%
n | 1209440 | 8.4%
d | 1134463 | 7.9%
c | 1133059 | 7.9%
f | 1131005 | 7.9%
k | 604720 | 4.2%
r | 604720 | 4.2%
Other values (2) | 1209440 | 8.4%

Decimal Number
Value | Count | Frequency (%)
6 | 2949751 | 17.9%
5 | 2343491 | 14.2%
2 | 1739197 | 10.6%
3 | 1737642 | 10.5%
4 | 1737535 | 10.5%
8 | 1286386 | 7.8%
9 | 1284901 | 7.8%
0 | 1134229 | 6.9%
1 | 1134029 | 6.9%
7 | 1133551 | 6.9%

Other Punctuation
Value | Count | Frequency (%)
/ | 3023600 | 62.5%
: | 1209440 | 25.0%
. | 604720 | 12.5%

Dash Punctuation
Value | Count | Frequency (%)
- | 2418880 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 23737352 | 62.3%
Latin | 14360008 | 37.7%

Most frequent character per script

Common
Value | Count | Frequency (%)
/ | 3023600 | 12.7%
6 | 2949751 | 12.4%
- | 2418880 | 10.2%
5 | 2343491 | 9.9%
2 | 1739197 | 7.3%
3 | 1737642 | 7.3%
4 | 1737535 | 7.3%
8 | 1286386 | 5.4%
9 | 1284901 | 5.4%
: | 1209440 | 5.1%
Other values (4) | 4006529 | 16.9%

Latin
Value | Count | Frequency (%)
t | 2418880 | 16.8%
a | 1889528 | 13.2%
e | 1738583 | 12.1%
b | 1286170 | 9.0%
n | 1209440 | 8.4%
d | 1134463 | 7.9%
c | 1133059 | 7.9%
f | 1131005 | 7.9%
k | 604720 | 4.2%
r | 604720 | 4.2%
Other values (2) | 1209440 | 8.4%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 38097360 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
/ | 3023600 | 7.9%
6 | 2949751 | 7.7%
- | 2418880 | 6.3%
t | 2418880 | 6.3%
5 | 2343491 | 6.2%
a | 1889528 | 5.0%
2 | 1739197 | 4.6%
e | 1738583 | 4.6%
3 | 1737642 | 4.6%
4 | 1737535 | 4.6%
Other values (16) | 16100273 | 42.3%

catalogNumber
Text

Missing 

Distinct: 371254
Distinct (%): > 99.9%
Missing: 233452
Missing (%): 38.6%
Memory size: 4.6 MiB

Length

Max length: 20
Median length: 15
Mean length: 15.03873482
Min length: 12

Characters and Unicode

Total characters: 5583401
Distinct characters: 21
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1

Unique

Unique: 371240
Unique (%): > 99.9%

Sample

1st row: USNMENT00831303
2nd row: USNMENT00356408
3rd row: USNMENT01436172
4th row: USNMENT00357025
5th row: USNMENT00314717

Value | Count | Frequency (%)
usnment00377587 | 2 | < 0.1%
usnment00381323 | 2 | < 0.1%
usnment00937212 | 2 | < 0.1%
usnment00377617 | 2 | < 0.1%
usnment00536541 | 2 | < 0.1%
usnment00533165 | 2 | < 0.1%
usnment00385557 | 2 | < 0.1%
usnment01200936 | 2 | < 0.1%
usnment00385731 | 2 | < 0.1%
usnment00937214 | 2 | < 0.1%
Other values (371244) | 371248 | > 99.9%

Most occurring characters

Value | Count | Frequency (%)
0 | 804733 | 14.4%
N | 741878 | 13.3%
1 | 377023 | 6.8%
S | 371268 | 6.6%
U | 371224 | 6.6%
M | 371224 | 6.6%
E | 370648 | 6.6%
T | 370648 | 6.6%
3 | 302855 | 5.4%
4 | 225937 | 4.0%
Other values (11) | 1275963 | 22.9%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 2982343 | 53.4%
Uppercase Letter | 2596978 | 46.5%
Other Punctuation | 4078 | 0.1%
Lowercase Letter | 2 | < 0.1%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
0 | 804733 | 27.0%
1 | 377023 | 12.6%
3 | 302855 | 10.2%
4 | 225937 | 7.6%
2 | 225472 | 7.6%
5 | 215981 | 7.2%
8 | 215588 | 7.2%
7 | 210834 | 7.1%
6 | 202437 | 6.8%
9 | 201483 | 6.8%

Uppercase Letter
Value | Count | Frequency (%)
N | 741878 | 28.6%
S | 371268 | 14.3%
U | 371224 | 14.3%
M | 371224 | 14.3%
E | 370648 | 14.3%
T | 370648 | 14.3%
C | 44 | < 0.1%
A | 44 | < 0.1%

Lowercase Letter
Value | Count | Frequency (%)
b | 1 | 50.0%
a | 1 | 50.0%

Other Punctuation
Value | Count | Frequency (%)
. | 4078 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 2986421 | 53.5%
Latin | 2596980 | 46.5%

Most frequent character per script

Common
Value | Count | Frequency (%)
0 | 804733 | 26.9%
1 | 377023 | 12.6%
3 | 302855 | 10.1%
4 | 225937 | 7.6%
2 | 225472 | 7.5%
5 | 215981 | 7.2%
8 | 215588 | 7.2%
7 | 210834 | 7.1%
6 | 202437 | 6.8%
9 | 201483 | 6.7%

Latin
Value | Count | Frequency (%)
N | 741878 | 28.6%
S | 371268 | 14.3%
U | 371224 | 14.3%
M | 371224 | 14.3%
E | 370648 | 14.3%
T | 370648 | 14.3%
C | 44 | < 0.1%
A | 44 | < 0.1%
b | 1 | < 0.1%
a | 1 | < 0.1%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 5583401 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
0 | 804733 | 14.4%
N | 741878 | 13.3%
1 | 377023 | 6.8%
S | 371268 | 6.6%
U | 371224 | 6.6%
M | 371224 | 6.6%
E | 370648 | 6.6%
T | 370648 | 6.6%
3 | 302855 | 5.4%
4 | 225937 | 4.0%
Other values (11) | 1275963 | 22.9%
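
The catalogNumber figures above show that Distinct (371254) and Unique (371240) are different counts. The sketch below assumes, consistently with those figures, that Distinct is the number of different non-missing values and Unique is the number of values occurring exactly once. The helper name and the sample values are hypothetical.

```python
from collections import Counter


def distinct_and_unique(values):
    """Return (distinct, unique) counts, ignoring None and empty strings."""
    counts = Counter(v for v in values if v not in (None, ""))
    distinct = len(counts)                               # different values
    unique = sum(1 for c in counts.values() if c == 1)   # seen exactly once
    return distinct, unique


# Hypothetical catalogue numbers: one value is repeated, one is missing.
values = ["USNMENT001", "USNMENT002", "USNMENT002", "USNMENT003", None]
distinct, unique = distinct_and_unique(values)
print(distinct, unique)

# Consistency check against the reported catalogNumber figures: if every
# repeated value occurs exactly twice, the counts add up to the number of
# non-missing cells.
non_missing = 604_720 - 233_452
repeated = 371_254 - 371_240
print(371_240 + 2 * repeated == non_missing)
```

Under that reading, 14 catalogue numbers appear twice, which agrees with the frequency table where the top values each have a count of 2.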

recordNumber
Text

Missing 

Distinct: 33
Distinct (%): 89.2%
Missing: 604683
Missing (%): > 99.9%
Memory size: 4.6 MiB

Length

Max length: 43
Median length: 26
Mean length: 17.13513514
Min length: 4

Characters and Unicode

Total characters: 634
Distinct characters: 57
Distinct categories: 8
Distinct scripts: 2
Distinct blocks: 1

Unique

Unique: 32
Unique (%): 86.5%

Sample

1st row: Collection number "14,957"
2nd row: Lot 607, Sub 182
3rd row: 4012
4th row: Dognin Collection
5th row: 12.097

Value | Count | Frequency (%)
collection | 10 | 10.0%
no | 9 | 9.0%
walsingham | 7 | 7.0%
dognin | 5 | 5.0%
hopkins | 3 | 3.0%
quaintance | 2 | 2.0%
wlsm | 2 | 2.0%
townes | 2 | 2.0%
number | 2 | 2.0%
from | 2 | 2.0%
Other values (56) | 56 | 56.0%

Most occurring characters

Value | Count | Frequency (%)
(space) | 63 | 9.9%
o | 52 | 8.2%
n | 47 | 7.4%
l | 39 | 6.2%
i | 33 | 5.2%
. | 26 | 4.1%
e | 25 | 3.9%
a | 22 | 3.5%
t | 19 | 3.0%
1 | 19 | 3.0%
Other values (47) | 289 | 45.6%

Most occurring categories

Value | Count | Frequency (%)
Lowercase Letter | 348 | 54.9%
Decimal Number | 114 | 18.0%
Uppercase Letter | 67 | 10.6%
Space Separator | 63 | 9.9%
Other Punctuation | 38 | 6.0%
Dash Punctuation | 2 | 0.3%
Open Punctuation | 1 | 0.2%
Close Punctuation | 1 | 0.2%

Most frequent character per category

Lowercase Letter
Value | Count | Frequency (%)
o | 52 | 14.9%
n | 47 | 13.5%
l | 39 | 11.2%
i | 33 | 9.5%
e | 25 | 7.2%
a | 22 | 6.3%
t | 19 | 5.5%
c | 18 | 5.2%
s | 16 | 4.6%
g | 14 | 4.0%
Other values (11) | 63 | 18.1%

Uppercase Letter
Value | Count | Frequency (%)
C | 14 | 20.9%
W | 9 | 13.4%
N | 9 | 13.4%
H | 6 | 9.0%
D | 5 | 7.5%
S | 4 | 6.0%
M | 3 | 4.5%
Q | 2 | 3.0%
T | 2 | 3.0%
U | 2 | 3.0%
Other values (9) | 11 | 16.4%

Decimal Number
Value | Count | Frequency (%)
1 | 19 | 16.7%
7 | 15 | 13.2%
0 | 14 | 12.3%
8 | 14 | 12.3%
5 | 12 | 10.5%
9 | 12 | 10.5%
4 | 8 | 7.0%
2 | 8 | 7.0%
6 | 7 | 6.1%
3 | 5 | 4.4%

Other Punctuation
Value | Count | Frequency (%)
. | 26 | 68.4%
" | 10 | 26.3%
, | 2 | 5.3%

Space Separator
Value | Count | Frequency (%)
(space) | 63 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 2 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 1 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 1 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 415 | 65.5%
Common | 219 | 34.5%

Most frequent character per script

Latin
Value | Count | Frequency (%)
o | 52 | 12.5%
n | 47 | 11.3%
l | 39 | 9.4%
i | 33 | 8.0%
e | 25 | 6.0%
a | 22 | 5.3%
t | 19 | 4.6%
c | 18 | 4.3%
s | 16 | 3.9%
C | 14 | 3.4%
Other values (30) | 130 | 31.3%

Common
Value | Count | Frequency (%)
(space) | 63 | 28.8%
. | 26 | 11.9%
1 | 19 | 8.7%
7 | 15 | 6.8%
0 | 14 | 6.4%
8 | 14 | 6.4%
5 | 12 | 5.5%
9 | 12 | 5.5%
" | 10 | 4.6%
4 | 8 | 3.7%
Other values (7) | 26 | 11.9%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 634 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
(space) | 63 | 9.9%
o | 52 | 8.2%
n | 47 | 7.4%
l | 39 | 6.2%
i | 33 | 5.2%
. | 26 | 4.1%
e | 25 | 3.9%
a | 22 | 3.5%
t | 19 | 3.0%
1 | 19 | 3.0%
Other values (47) | 289 | 45.6%

recordedBy
Text

Missing 

Distinct: 18727
Distinct (%): 4.7%
Missing: 203369
Missing (%): 33.6%
Memory size: 4.6 MiB

Length

Max length: 90
Median length: 84
Mean length: 11.25701693
Min length: 1

Characters and Unicode

Total characters: 4518015
Distinct characters: 83
Distinct categories: 8
Distinct scripts: 2
Distinct blocks: 2

Unique

Unique: 9104
Unique (%): 2.3%

Sample

1st row: M. Ortiz B.
2nd row: [Not Stated]
3rd row: S. Roble
4th row: [Not Stated]
5th row: C. Flint
ValueCountFrequency (%)
not 65723
 
7.2%
stated 65707
 
7.2%
l 40187
 
4.4%
39883
 
4.4%
j 36893
 
4.0%
macior 31234
 
3.4%
d 28472
 
3.1%
c 27158
 
3.0%
r 25638
 
2.8%
b 22051
 
2.4%
Other values (10691) 530867
58.1%

Most occurring characters

ValueCountFrequency (%)
512462
 
11.3%
. 355587
 
7.9%
t 305186
 
6.8%
a 299390
 
6.6%
e 290122
 
6.4%
o 240216
 
5.3%
r 229316
 
5.1%
i 173792
 
3.8%
n 169878
 
3.8%
l 136877
 
3.0%
Other values (73) 1805189
40.0%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 2588000
57.3%
Uppercase Letter 878362
 
19.4%
Space Separator 512462
 
11.3%
Other Punctuation 405464
 
9.0%
Open Punctuation 65758
 
1.5%
Close Punctuation 65758
 
1.5%
Dash Punctuation 2190
 
< 0.1%
Decimal Number 21
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
t 305186
11.8%
a 299390
11.6%
e 290122
11.2%
o 240216
9.3%
r 229316
8.9%
i 173792
 
6.7%
n 169878
 
6.6%
l 136877
 
5.3%
d 115279
 
4.5%
s 95763
 
3.7%
Other values (25) 532181
20.6%
Uppercase Letter
ValueCountFrequency (%)
S 116412
13.3%
M 90630
 
10.3%
N 79769
 
9.1%
B 56916
 
6.5%
C 54340
 
6.2%
L 51918
 
5.9%
D 47335
 
5.4%
J 42565
 
4.8%
W 40161
 
4.6%
G 38221
 
4.4%
Other values (17) 260095
29.6%
Decimal Number
ValueCountFrequency (%)
1 8
38.1%
5 5
23.8%
0 2
 
9.5%
2 2
 
9.5%
6 2
 
9.5%
9 1
 
4.8%
3 1
 
4.8%
Other Punctuation
ValueCountFrequency (%)
. 355587
87.7%
& 39874
 
9.8%
, 9364
 
2.3%
' 622
 
0.2%
? 16
 
< 0.1%
/ 1
 
< 0.1%
Open Punctuation
ValueCountFrequency (%)
[ 65747
> 99.9%
( 10
 
< 0.1%
{ 1
 
< 0.1%
Close Punctuation
ValueCountFrequency (%)
] 65747
> 99.9%
) 10
 
< 0.1%
} 1
 
< 0.1%
Space Separator
ValueCountFrequency (%)
512462
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 2190
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 3466362
76.7%
Common 1051653
 
23.3%

Most frequent character per script

Latin
ValueCountFrequency (%)
t 305186
 
8.8%
a 299390
 
8.6%
e 290122
 
8.4%
o 240216
 
6.9%
r 229316
 
6.6%
i 173792
 
5.0%
n 169878
 
4.9%
l 136877
 
3.9%
S 116412
 
3.4%
d 115279
 
3.3%
Other values (52) 1389894
40.1%
Common
ValueCountFrequency (%)
512462
48.7%
. 355587
33.8%
[ 65747
 
6.3%
] 65747
 
6.3%
& 39874
 
3.8%
, 9364
 
0.9%
- 2190
 
0.2%
' 622
 
0.1%
? 16
 
< 0.1%
) 10
 
< 0.1%
Other values (11) 34
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 4517525
> 99.9%
None 490
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
512462
 
11.3%
. 355587
 
7.9%
t 305186
 
6.8%
a 299390
 
6.6%
e 290122
 
6.4%
o 240216
 
5.3%
r 229316
 
5.1%
i 173792
 
3.8%
n 169878
 
3.8%
l 136877
 
3.0%
Other values (63) 1804699
39.9%
None
ValueCountFrequency (%)
ñ 238
48.6%
ü 107
21.8%
á 95
 
19.4%
ä 13
 
2.7%
é 12
 
2.4%
ö 12
 
2.4%
ó 8
 
1.6%
Á 2
 
0.4%
č 2
 
0.4%
â 1
 
0.2%
Distinct: 941
Distinct (%): 0.2%
Missing: 3136
Missing (%): 0.5%
Memory size: 4.6 MiB

Length

Max length: 7
Median length: 1
Mean length: 1.044863228
Min length: 1

Characters and Unicode

Total characters: 628573
Distinct characters: 10
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 393
Unique (%): 0.1%

Sample

1st row: 7
2nd row: 1
3rd row: 1
4th row: 1
5th row: 1
ValueCountFrequency (%)
1 548305
91.1%
2 10273
 
1.7%
3 6619
 
1.1%
4 4295
 
0.7%
5 2621
 
0.4%
6 2340
 
0.4%
7 1822
 
0.3%
8 1527
 
0.3%
10 1306
 
0.2%
9 1254
 
0.2%
Other values (931) 21222
 
3.5%

Most occurring characters

ValueCountFrequency (%)
1 560888
89.2%
2 17645
 
2.8%
3 11801
 
1.9%
4 8337
 
1.3%
5 6511
 
1.0%
0 6142
 
1.0%
6 5349
 
0.9%
7 4420
 
0.7%
8 3992
 
0.6%
9 3488
 
0.6%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 628573
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 560888
89.2%
2 17645
 
2.8%
3 11801
 
1.9%
4 8337
 
1.3%
5 6511
 
1.0%
0 6142
 
1.0%
6 5349
 
0.9%
7 4420
 
0.7%
8 3992
 
0.6%
9 3488
 
0.6%

Most occurring scripts

ValueCountFrequency (%)
Common 628573
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
1 560888
89.2%
2 17645
 
2.8%
3 11801
 
1.9%
4 8337
 
1.3%
5 6511
 
1.0%
0 6142
 
1.0%
6 5349
 
0.9%
7 4420
 
0.7%
8 3992
 
0.6%
9 3488
 
0.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII 628573
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 560888
89.2%
2 17645
 
2.8%
3 11801
 
1.9%
4 8337
 
1.3%
5 6511
 
1.0%
0 6142
 
1.0%
6 5349
 
0.9%
7 4420
 
0.7%
8 3992
 
0.6%
9 3488
 
0.6%

sex
Text

Missing 

Distinct: 95
Distinct (%): < 0.1%
Missing: 339511
Missing (%): 56.1%
Memory size: 4.6 MiB

Length

Max length: 40
Median length: 34
Mean length: 5.351737686
Min length: 4

Characters and Unicode

Total characters: 1419329
Distinct characters: 23
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1

Unique

Unique: 24
Unique (%): < 0.1%

Sample

1st row: Worker
2nd row: Male
3rd row: Male
4th row: Male
5th row: Male
ValueCountFrequency (%)
male 137835
50.2%
female 93225
34.0%
unknown 34039
 
12.4%
worker 7022
 
2.6%
1487
 
0.5%
unable 240
 
0.1%
to 240
 
0.1%
determine 240
 
0.1%

Most occurring characters

ValueCountFrequency (%)
e 332267
23.4%
l 231300
16.3%
a 231300
16.3%
M 120716
 
8.5%
m 110584
 
7.8%
n 102597
 
7.2%
F 80595
 
5.7%
o 41301
 
2.9%
k 41061
 
2.9%
U 34224
 
2.4%
Other values (13) 93384
 
6.6%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 1152779
81.2%
Uppercase Letter 242396
 
17.1%
Other Punctuation 15035
 
1.1%
Space Separator 9119
 
0.6%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 332267
28.8%
l 231300
20.1%
a 231300
20.1%
m 110584
 
9.6%
n 102597
 
8.9%
o 41301
 
3.6%
k 41061
 
3.6%
w 34200
 
3.0%
r 14284
 
1.2%
f 12630
 
1.1%
Other values (5) 1255
 
0.1%
Uppercase Letter
ValueCountFrequency (%)
M 120716
49.8%
F 80595
33.2%
U 34224
 
14.1%
W 6861
 
2.8%
Other Punctuation
ValueCountFrequency (%)
; 13780
91.7%
& 1253
 
8.3%
, 2
 
< 0.1%
Space Separator
ValueCountFrequency (%)
9119
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 1395175
98.3%
Common 24154
 
1.7%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 332267
23.8%
l 231300
16.6%
a 231300
16.6%
M 120716
 
8.7%
m 110584
 
7.9%
n 102597
 
7.4%
F 80595
 
5.8%
o 41301
 
3.0%
k 41061
 
2.9%
U 34224
 
2.5%
Other values (9) 69230
 
5.0%
Common
ValueCountFrequency (%)
; 13780
57.1%
9119
37.8%
& 1253
 
5.2%
, 2
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1419329
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 332267
23.4%
l 231300
16.3%
a 231300
16.3%
M 120716
 
8.5%
m 110584
 
7.8%
n 102597
 
7.2%
F 80595
 
5.7%
o 41301
 
2.9%
k 41061
 
2.9%
U 34224
 
2.4%
Other values (13) 93384
 
6.6%

lifeStage
Text

Missing 

Distinct: 178
Distinct (%): < 0.1%
Missing: 174155
Missing (%): 28.8%
Memory size: 4.6 MiB

Length

Max length: 59
Median length: 5
Mean length: 5.285092843
Min length: 1

Characters and Unicode

Total characters: 2275576
Distinct characters: 45
Distinct categories: 7
Distinct scripts: 2
Distinct blocks: 1

Unique

Unique: 60
Unique (%): < 0.1%

Sample

1st row: Adult
2nd row: Adult
3rd row: Adult
4th row: Adult
5th row: Adult
ValueCountFrequency (%)
adult 425078
95.7%
immature 4871
 
1.1%
wings 3368
 
0.8%
alate 1659
 
0.4%
apterous 1572
 
0.4%
pupa 1198
 
0.3%
soldier 1080
 
0.2%
worker 1007
 
0.2%
larva 943
 
0.2%
reproductive 667
 
0.2%
Other values (46) 2928
 
0.7%

Most occurring characters

ValueCountFrequency (%)
t 434662
19.1%
u 434253
19.1%
l 428547
18.8%
d 426929
18.8%
A 392238
17.2%
a 47160
 
2.1%
13806
 
0.6%
e 13577
 
0.6%
r 11681
 
0.5%
m 10485
 
0.5%
Other values (35) 62238
 
2.7%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 1844501
81.1%
Uppercase Letter 406785
 
17.9%
Space Separator 13806
 
0.6%
Other Punctuation 10463
 
0.5%
Open Punctuation 10
 
< 0.1%
Close Punctuation 10
 
< 0.1%
Dash Punctuation 1
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
t 434662
23.6%
u 434253
23.5%
l 428547
23.2%
d 426929
23.1%
a 47160
 
2.6%
e 13577
 
0.7%
r 11681
 
0.6%
m 10485
 
0.6%
i 6614
 
0.4%
n 5371
 
0.3%
Other values (12) 25222
 
1.4%
Uppercase Letter
ValueCountFrequency (%)
A 392238
96.4%
I 4857
 
1.2%
W 4377
 
1.1%
P 1253
 
0.3%
S 1098
 
0.3%
L 809
 
0.2%
R 668
 
0.2%
U 667
 
0.2%
N 399
 
0.1%
T 177
 
< 0.1%
Other values (8) 242
 
0.1%
Space Separator
ValueCountFrequency (%)
13806
100.0%
Other Punctuation
ValueCountFrequency (%)
; 10463
100.0%
Open Punctuation
ValueCountFrequency (%)
[ 10
100.0%
Close Punctuation
ValueCountFrequency (%)
] 10
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 2251286
98.9%
Common 24290
 
1.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
t 434662
19.3%
u 434253
19.3%
l 428547
19.0%
d 426929
19.0%
A 392238
17.4%
a 47160
 
2.1%
e 13577
 
0.6%
r 11681
 
0.5%
m 10485
 
0.5%
i 6614
 
0.3%
Other values (30) 45140
 
2.0%
Common
ValueCountFrequency (%)
13806
56.8%
; 10463
43.1%
[ 10
 
< 0.1%
] 10
 
< 0.1%
- 1
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 2275576
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
t 434662
19.1%
u 434253
19.1%
l 428547
18.8%
d 426929
18.8%
A 392238
17.2%
a 47160
 
2.1%
13806
 
0.6%
e 13577
 
0.6%
r 11681
 
0.5%
m 10485
 
0.5%
Other values (35) 62238
 
2.7%

preparations
Text

Missing 

Distinct: 272
Distinct (%): < 0.1%
Missing: 42056
Missing (%): 7.0%
Memory size: 4.6 MiB

Length

Max length: 93
Median length: 6
Mean length: 6.839828032
Min length: 1

Characters and Unicode

Total characters: 3848525
Distinct characters: 58
Distinct categories: 5
Distinct scripts: 2
Distinct blocks: 1

Unique

Unique: 112
Unique (%): < 0.1%

Sample

1st row: Pinned
2nd row: Pinned
3rd row: Pinned
4th row: Envelope
5th row: Pinned
ValueCountFrequency (%)
pinned 389792
63.9%
envelope 114691
 
18.8%
slide 65067
 
10.7%
vial 9498
 
1.6%
ethanol 6482
 
1.1%
section 3747
 
0.6%
on 3653
 
0.6%
3195
 
0.5%
ink 3151
 
0.5%
pen 3072
 
0.5%
Other values (93) 7800
 
1.3%

Most occurring characters

ValueCountFrequency (%)
n 916570
23.8%
e 701224
18.2%
i 472718
12.3%
d 455956
11.8%
P 366246
 
9.5%
l 199786
 
5.2%
p 142808
 
3.7%
o 133897
 
3.5%
v 114853
 
3.0%
E 112904
 
2.9%
Other values (48) 231563
 
6.0%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 3214548
83.5%
Uppercase Letter 553430
 
14.4%
Space Separator 47484
 
1.2%
Other Punctuation 32282
 
0.8%
Decimal Number 781
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
n 916570
28.5%
e 701224
21.8%
i 472718
14.7%
d 455956
14.2%
l 199786
 
6.2%
p 142808
 
4.4%
o 133897
 
4.2%
v 114853
 
3.6%
a 18598
 
0.6%
s 17527
 
0.5%
Other values (15) 40611
 
1.3%
Uppercase Letter
ValueCountFrequency (%)
P 366246
66.2%
E 112904
 
20.4%
S 56010
 
10.1%
V 9718
 
1.8%
I 3164
 
0.6%
B 2575
 
0.5%
R 887
 
0.2%
M 523
 
0.1%
C 505
 
0.1%
D 388
 
0.1%
Other values (10) 510
 
0.1%
Other Punctuation
ValueCountFrequency (%)
; 28582
88.5%
& 3195
 
9.9%
% 389
 
1.2%
. 69
 
0.2%
, 28
 
0.1%
/ 15
 
< 0.1%
? 4
 
< 0.1%
Decimal Number
ValueCountFrequency (%)
5 389
49.8%
7 389
49.8%
2 1
 
0.1%
3 1
 
0.1%
9 1
 
0.1%
Space Separator
ValueCountFrequency (%)
47484
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 3767978
97.9%
Common 80547
 
2.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
n 916570
24.3%
e 701224
18.6%
i 472718
12.5%
d 455956
12.1%
P 366246
 
9.7%
l 199786
 
5.3%
p 142808
 
3.8%
o 133897
 
3.6%
v 114853
 
3.0%
E 112904
 
3.0%
Other values (35) 151016
 
4.0%
Common
ValueCountFrequency (%)
47484
59.0%
; 28582
35.5%
& 3195
 
4.0%
5 389
 
0.5%
% 389
 
0.5%
7 389
 
0.5%
. 69
 
0.1%
, 28
 
< 0.1%
/ 15
 
< 0.1%
? 4
 
< 0.1%
Other values (3) 3
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 3848525
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
n 916570
23.8%
e 701224
18.2%
i 472718
12.3%
d 455956
11.8%
P 366246
 
9.5%
l 199786
 
5.2%
p 142808
 
3.7%
o 133897
 
3.5%
v 114853
 
3.0%
E 112904
 
2.9%
Other values (48) 231563
 
6.0%

associatedMedia
Text

Missing 

Distinct: 214407
Distinct (%): 99.9%
Missing: 390092
Missing (%): 64.5%
Memory size: 4.6 MiB

Length

Max length: 259
Median length: 49
Mean length: 52.23455467
Min length: 48

Characters and Unicode

Total characters: 11210998
Distinct characters: 31
Distinct categories: 5
Distinct scripts: 2
Distinct blocks: 1

Unique

Unique: 214268
Unique (%): 99.8%

Sample

1st row: https://collections.nmnh.si.edu/media/?i=16421668
2nd row: https://collections.nmnh.si.edu/media/?i=16411146
3rd row: https://collections.nmnh.si.edu/media/?i=16342640
4th row: https://collections.nmnh.si.edu/media/?i=16365128
5th row: https://collections.nmnh.si.edu/media/?i=16326001
ValueCountFrequency (%)
https://collections.nmnh.si.edu/media/?i=16612365 38
 
< 0.1%
https://collections.nmnh.si.edu/media/?i=16556913 19
 
< 0.1%
https://collections.nmnh.si.edu/media/?i=16558066 14
 
< 0.1%
16556913 12
 
< 0.1%
https://collections.nmnh.si.edu/media/?i=16623013 10
 
< 0.1%
16574611 9
 
< 0.1%
https://collections.nmnh.si.edu/media/?i=16945972 7
 
< 0.1%
16561531 7
 
< 0.1%
16556901 7
 
< 0.1%
https://collections.nmnh.si.edu/media/?i=16947492 5
 
< 0.1%
Other values (284058) 287167
> 99.9%

Most occurring characters

ValueCountFrequency (%)
i 858512
 
7.7%
/ 858512
 
7.7%
e 643884
 
5.7%
t 643884
 
5.7%
s 643884
 
5.7%
. 643884
 
5.7%
n 643884
 
5.7%
1 468009
 
4.2%
l 429256
 
3.8%
o 429256
 
3.8%
Other values (21) 4948033
44.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 6653468
59.3%
Decimal Number 2265916
 
20.2%
Other Punctuation 2004319
 
17.9%
Math Symbol 214628
 
1.9%
Space Separator 72667
 
0.6%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
i 858512
12.9%
e 643884
9.7%
t 643884
9.7%
s 643884
9.7%
n 643884
9.7%
l 429256
 
6.5%
o 429256
 
6.5%
c 429256
 
6.5%
m 429256
 
6.5%
d 429256
 
6.5%
Other values (4) 1073140
16.1%
Decimal Number
ValueCountFrequency (%)
1 468009
20.7%
6 265288
11.7%
3 251481
11.1%
4 215243
9.5%
0 211833
9.3%
9 201543
8.9%
7 178897
 
7.9%
2 166354
 
7.3%
5 164303
 
7.3%
8 142965
 
6.3%
Other Punctuation
ValueCountFrequency (%)
/ 858512
42.8%
. 643884
32.1%
? 214628
 
10.7%
: 214628
 
10.7%
; 72667
 
3.6%
Math Symbol
ValueCountFrequency (%)
= 214628
100.0%
Space Separator
ValueCountFrequency (%)
72667
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 6653468
59.3%
Common 4557530
40.7%

Most frequent character per script

Common
ValueCountFrequency (%)
/ 858512
18.8%
. 643884
14.1%
1 468009
10.3%
6 265288
 
5.8%
3 251481
 
5.5%
4 215243
 
4.7%
= 214628
 
4.7%
? 214628
 
4.7%
: 214628
 
4.7%
0 211833
 
4.6%
Other values (7) 999396
21.9%
Latin
ValueCountFrequency (%)
i 858512
12.9%
e 643884
9.7%
t 643884
9.7%
s 643884
9.7%
n 643884
9.7%
l 429256
 
6.5%
o 429256
 
6.5%
c 429256
 
6.5%
m 429256
 
6.5%
d 429256
 
6.5%
Other values (4) 1073140
16.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 11210998
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
i 858512
 
7.7%
/ 858512
 
7.7%
e 643884
 
5.7%
t 643884
 
5.7%
s 643884
 
5.7%
. 643884
 
5.7%
n 643884
 
5.7%
1 468009
 
4.2%
l 429256
 
3.8%
o 429256
 
3.8%
Other values (21) 4948033
44.1%

occurrenceRemarks
Text

Missing 

Distinct: 31235
Distinct (%): 21.5%
Missing: 459346
Missing (%): 76.0%
Memory size: 4.6 MiB

Length

Max length: 151446
Median length: 89176
Mean length: 77.46756641
Min length: 1

Characters and Unicode

Total characters: 11261770
Distinct characters: 120
Distinct categories: 17
Distinct scripts: 2
Distinct blocks: 5

Unique

Unique: 27503
Unique (%): 18.9%

Sample

1st row: One leg removed for genetic sampling while on loan to GUELPH.
2nd row: Lindroth, 1975:125: (the loc. is no doubt wrong).
3rd row: F. Monros Coll. 1959 G.M. Greene Coll. C. Schaeffer Coll. Shoemaker Coll. 1956 A. Nicolay Coll. 1950 L.W. Saylor Coll.
4th row: Specimen data is incomplete. Phase 1 of data capture inlcluded USNMENT#s and general locality.
5th row: One leg removed for genetic sampling while on loan to GUELPH.
ValueCountFrequency (%)
digitization 56218
 
3.4%
by 48162
 
2.9%
digital 44075
 
2.7%
transcribed 44039
 
2.7%
volunteers 44039
 
2.7%
of 42600
 
2.6%
on 41034
 
2.5%
to 36795
 
2.2%
loan 36495
 
2.2%
for 36258
 
2.2%
Other values (46961) 1230406
74.1%

Most occurring characters

ValueCountFrequency (%)
1496225
 
13.3%
e 833183
 
7.4%
i 803329
 
7.1%
a 671754
 
6.0%
t 666687
 
5.9%
o 651617
 
5.8%
n 613739
 
5.4%
r 553298
 
4.9%
s 447996
 
4.0%
l 427415
 
3.8%
Other values (110) 4096527
36.4%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 7971658
70.8%
Space Separator 1496225
 
13.3%
Uppercase Letter 1027865
 
9.1%
Other Punctuation 295394
 
2.6%
Decimal Number 259592
 
2.3%
Control 101184
 
0.9%
Open Punctuation 39698
 
0.4%
Close Punctuation 39677
 
0.4%
Dash Punctuation 18103
 
0.2%
Math Symbol 12131
 
0.1%
Other values (7) 243
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 833183
10.5%
i 803329
10.1%
a 671754
 
8.4%
t 666687
 
8.4%
o 651617
 
8.2%
n 613739
 
7.7%
r 553298
 
6.9%
s 447996
 
5.6%
l 427415
 
5.4%
d 318849
 
4.0%
Other values (26) 1983791
24.9%
Uppercase Letter
ValueCountFrequency (%)
P 116768
11.4%
O 101082
 
9.8%
S 101033
 
9.8%
E 82579
 
8.0%
D 70631
 
6.9%
I 64203
 
6.2%
T 62905
 
6.1%
M 62182
 
6.0%
U 54826
 
5.3%
L 50803
 
4.9%
Other values (19) 260853
25.4%
Other Punctuation
ValueCountFrequency (%)
. 173902
58.9%
; 47091
 
15.9%
, 31645
 
10.7%
: 15772
 
5.3%
# 9313
 
3.2%
/ 7124
 
2.4%
' 5317
 
1.8%
" 2639
 
0.9%
& 1678
 
0.6%
? 818
 
0.3%
Other values (7) 95
 
< 0.1%
Decimal Number
ValueCountFrequency (%)
1 55149
21.2%
9 38333
14.8%
0 29746
11.5%
2 27054
10.4%
6 20186
 
7.8%
3 19885
 
7.7%
5 19160
 
7.4%
8 17051
 
6.6%
7 16622
 
6.4%
4 16406
 
6.3%
Math Symbol
ValueCountFrequency (%)
| 10552
87.0%
= 836
 
6.9%
+ 720
 
5.9%
> 11
 
0.1%
~ 8
 
0.1%
< 4
 
< 0.1%
Other Symbol
ValueCountFrequency (%)
° 17
47.2%
14
38.9%
4
 
11.1%
© 1
 
2.8%
Open Punctuation
ValueCountFrequency (%)
( 33578
84.6%
[ 6109
 
15.4%
{ 11
 
< 0.1%
Close Punctuation
ValueCountFrequency (%)
) 33570
84.6%
] 6096
 
15.4%
} 11
 
< 0.1%
Control
ValueCountFrequency (%)
100652
99.5%
532
 
0.5%
Dash Punctuation
ValueCountFrequency (%)
- 18102
> 99.9%
1
 
< 0.1%
Currency Symbol
ValueCountFrequency (%)
$ 1
50.0%
£ 1
50.0%
Space Separator
ValueCountFrequency (%)
1496225
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 149
100.0%
Initial Punctuation
ValueCountFrequency (%)
23
100.0%
Final Punctuation
ValueCountFrequency (%)
23
100.0%
Modifier Symbol
ValueCountFrequency (%)
^ 9
100.0%
Modifier Letter
ValueCountFrequency (%)
ʼ 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 8999521
79.9%
Common 2262249
 
20.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 833183
 
9.3%
i 803329
 
8.9%
a 671754
 
7.5%
t 666687
 
7.4%
o 651617
 
7.2%
n 613739
 
6.8%
r 553298
 
6.1%
s 447996
 
5.0%
l 427415
 
4.7%
d 318849
 
3.5%
Other values (54) 3011654
33.5%
Common
ValueCountFrequency (%)
1496225
66.1%
. 173902
 
7.7%
100652
 
4.4%
1 55149
 
2.4%
; 47091
 
2.1%
9 38333
 
1.7%
( 33578
 
1.5%
) 33570
 
1.5%
, 31645
 
1.4%
0 29746
 
1.3%
Other values (46) 222358
 
9.8%

Most occurring blocks

ValueCountFrequency (%)
ASCII 11261640
> 99.9%
None 62
 
< 0.1%
Punctuation 49
 
< 0.1%
Misc Symbols 18
 
< 0.1%
Modifier Letters 1
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1496225
 
13.3%
e 833183
 
7.4%
i 803329
 
7.1%
a 671754
 
6.0%
t 666687
 
5.9%
o 651617
 
5.8%
n 613739
 
5.4%
r 553298
 
4.9%
s 447996
 
4.0%
l 427415
 
3.8%
Other values (85) 4096397
36.4%
Punctuation
ValueCountFrequency (%)
23
46.9%
23
46.9%
2
 
4.1%
1
 
2.0%
None
ValueCountFrequency (%)
° 17
27.4%
· 7
11.3%
á 6
 
9.7%
é 4
 
6.5%
ó 4
 
6.5%
ö 4
 
6.5%
ø 3
 
4.8%
í 3
 
4.8%
µ 2
 
3.2%
ü 2
 
3.2%
Other values (8) 10
16.1%
Misc Symbols
ValueCountFrequency (%)
14
77.8%
4
 
22.2%
Modifier Letters
ValueCountFrequency (%)
ʼ 1
100.0%

organismID
Text

Constant  Missing 

Distinct: 1
Distinct (%): 100.0%
Missing: 604719
Missing (%): > 99.9%
Memory size: 4.6 MiB

Length

Max length: 9
Median length: 9
Mean length: 9
Min length: 9

Characters and Unicode

Total characters: 9
Distinct characters: 9
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1

Unique

Unique: 1
Unique (%): 100.0%

Sample

1st row: 70 21'9"W
ValueCountFrequency (%)
70 1
50.0%
21'9"w 1
50.0%

Most occurring characters

ValueCountFrequency (%)
7 1
11.1%
0 1
11.1%
1
11.1%
2 1
11.1%
1 1
11.1%
' 1
11.1%
9 1
11.1%
" 1
11.1%
W 1
11.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 5
55.6%
Other Punctuation 2
 
22.2%
Space Separator 1
 
11.1%
Uppercase Letter 1
 
11.1%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
7 1
20.0%
0 1
20.0%
2 1
20.0%
1 1
20.0%
9 1
20.0%
Other Punctuation
ValueCountFrequency (%)
' 1
50.0%
" 1
50.0%
Space Separator
ValueCountFrequency (%)
1
100.0%
Uppercase Letter
ValueCountFrequency (%)
W 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 8
88.9%
Latin 1
 
11.1%

Most frequent character per script

Common
ValueCountFrequency (%)
7 1
12.5%
0 1
12.5%
1
12.5%
2 1
12.5%
1 1
12.5%
' 1
12.5%
9 1
12.5%
" 1
12.5%
Latin
ValueCountFrequency (%)
W 1
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 9
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
7 1
11.1%
0 1
11.1%
1
11.1%
2 1
11.1%
1 1
11.1%
' 1
11.1%
9 1
11.1%
" 1
11.1%
W 1
11.1%

eventType
Text

Constant  Missing 

Distinct: 1
Distinct (%): 100.0%
Missing: 604719
Missing (%): > 99.9%
Memory size: 4.6 MiB

Length

Max length: 8
Median length: 8
Mean length: 8
Min length: 8

Characters and Unicode

Total characters: 8
Distinct characters: 6
Distinct categories: 3
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 1
Unique (%): 100.0%

Sample

1st row: -11.7815
ValueCountFrequency (%)
11.7815 1
100.0%

Most occurring characters

ValueCountFrequency (%)
1 3
37.5%
- 1
 
12.5%
. 1
 
12.5%
7 1
 
12.5%
8 1
 
12.5%
5 1
 
12.5%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 6
75.0%
Dash Punctuation 1
 
12.5%
Other Punctuation 1
 
12.5%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 3
50.0%
7 1
 
16.7%
8 1
 
16.7%
5 1
 
16.7%
Dash Punctuation
ValueCountFrequency (%)
- 1
100.0%
Other Punctuation
ValueCountFrequency (%)
. 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 8
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
1 3
37.5%
- 1
 
12.5%
. 1
 
12.5%
7 1
 
12.5%
8 1
 
12.5%
5 1
 
12.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII 8
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 3
37.5%
- 1
 
12.5%
. 1
 
12.5%
7 1
 
12.5%
8 1
 
12.5%
5 1
 
12.5%

fieldNumber
Text

Missing 

Distinct: 3093
Distinct (%): 72.7%
Missing: 600468
Missing (%): 99.3%
Memory size: 4.6 MiB

Length

Max length: 32
Median length: 29
Mean length: 9.591251176
Min length: 1

Characters and Unicode

Total characters: 40782
Distinct characters: 70
Distinct categories: 8
Distinct scripts: 2
Distinct blocks: 1

Unique

Unique: 2648
Unique (%): 62.3%

Sample

1st row: BBB991
2nd row: BBB642-DERM
3rd row: 1653
4th row: JSL021109-18
5th row: COL-8-101
ValueCountFrequency (%)
1653 128
 
2.8%
2 46
 
1.0%
bbb899-hym 34
 
0.7%
1 32
 
0.7%
bbb791-hym 26
 
0.6%
bbb749-hym 23
 
0.5%
759-8 22
 
0.5%
tub 20
 
0.4%
tank 18
 
0.4%
9 18
 
0.4%
Other values (3089) 4227
92.0%

Most occurring characters

ValueCountFrequency (%)
B 4784
 
11.7%
0 3997
 
9.8%
- 3980
 
9.8%
1 3402
 
8.3%
2 2239
 
5.5%
3 1558
 
3.8%
6 1542
 
3.8%
7 1514
 
3.7%
4 1498
 
3.7%
9 1482
 
3.6%
Other values (60) 14786
36.3%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 19503
47.8%
Uppercase Letter 15056
36.9%
Dash Punctuation 3980
 
9.8%
Lowercase Letter 1242
 
3.0%
Other Punctuation 655
 
1.6%
Space Separator 342
 
0.8%
Open Punctuation 2
 
< 0.1%
Close Punctuation 2
 
< 0.1%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
B 4784
31.8%
S 1388
 
9.2%
T 1136
 
7.5%
C 792
 
5.3%
M 764
 
5.1%
A 708
 
4.7%
L 667
 
4.4%
R 639
 
4.2%
N 583
 
3.9%
H 533
 
3.5%
Other values (15) 3062
20.3%
Lowercase Letter
ValueCountFrequency (%)
e 146
11.8%
a 138
11.1%
o 134
10.8%
t 118
 
9.5%
b 82
 
6.6%
n 81
 
6.5%
r 67
 
5.4%
m 57
 
4.6%
c 57
 
4.6%
i 55
 
4.4%
Other values (13) 307
24.7%
Decimal Number
ValueCountFrequency (%)
0 3997
20.5%
1 3402
17.4%
2 2239
11.5%
3 1558
 
8.0%
6 1542
 
7.9%
7 1514
 
7.8%
4 1498
 
7.7%
9 1482
 
7.6%
5 1174
 
6.0%
8 1097
 
5.6%
Other Punctuation
ValueCountFrequency (%)
# 344
52.5%
. 200
30.5%
; 93
 
14.2%
, 10
 
1.5%
' 3
 
0.5%
" 3
 
0.5%
/ 1
 
0.2%
: 1
 
0.2%
Dash Punctuation
ValueCountFrequency (%)
- 3980
100.0%
Space Separator
ValueCountFrequency (%)
342
100.0%
Open Punctuation
ValueCountFrequency (%)
( 2
100.0%
Close Punctuation
ValueCountFrequency (%)
) 2
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 24484
60.0%
Latin 16298
40.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
B 4784
29.4%
S 1388
 
8.5%
T 1136
 
7.0%
C 792
 
4.9%
M 764
 
4.7%
A 708
 
4.3%
L 667
 
4.1%
R 639
 
3.9%
N 583
 
3.6%
H 533
 
3.3%
Other values (38) 4304
26.4%
Common
ValueCountFrequency (%)
0 3997
16.3%
- 3980
16.3%
1 3402
13.9%
2 2239
9.1%
3 1558
 
6.4%
6 1542
 
6.3%
7 1514
 
6.2%
4 1498
 
6.1%
9 1482
 
6.1%
5 1174
 
4.8%
Other values (12) 2098
8.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII 40782
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
B 4784
 
11.7%
0 3997
 
9.8%
- 3980
 
9.8%
1 3402
 
8.3%
2 2239
 
5.5%
3 1558
 
3.8%
6 1542
 
3.8%
7 1514
 
3.7%
4 1498
 
3.7%
9 1482
 
3.6%
Other values (60) 14786
36.3%

eventDate
Text

Missing 

Distinct: 46148
Distinct (%): 12.6%
Missing: 239420
Missing (%): 39.6%
Memory size: 4.6 MiB

Length

Max length: 24
Median length: 10
Mean length: 11.06884752
Min length: 4

Characters and Unicode

Total characters: 4043450
Distinct characters: 16
Distinct categories: 5
Distinct scripts: 2
Distinct blocks: 1

Unique

Unique: 13232
Unique (%): 3.6%

Sample

1st row: 1967-06-20
2nd row: 1914-07
3rd row: 2005-08-02
4th row: 1964-04-25
5th row: 1971-08-22
ValueCountFrequency (%)
1998-07-26 709
 
0.2%
1938 574
 
0.2%
2006-06-24 544
 
0.1%
1933 524
 
0.1%
1960-06-30 506
 
0.1%
1936 472
 
0.1%
1927-07-10 469
 
0.1%
1964-08-01/1964-08-31 449
 
0.1%
1930 435
 
0.1%
1966-06-23 407
 
0.1%
Other values (46130) 360245
98.6%

Most occurring characters

ValueCountFrequency (%)
- 782638
19.4%
1 702348
17.4%
0 653087
16.2%
9 492965
12.2%
2 288127
 
7.1%
6 225522
 
5.6%
7 216691
 
5.4%
8 183523
 
4.5%
5 159737
 
4.0%
3 155914
 
3.9%
Other values (6) 182898
 
4.5%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 3213904
79.5%
Dash Punctuation 782638
 
19.4%
Other Punctuation 46840
 
1.2%
Space Separator 34
 
< 0.1%
Lowercase Letter 34
 
< 0.1%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 702348
21.9%
0 653087
20.3%
9 492965
15.3%
2 288127
9.0%
6 225522
 
7.0%
7 216691
 
6.7%
8 183523
 
5.7%
5 159737
 
5.0%
3 155914
 
4.9%
4 135990
 
4.2%
Other Punctuation
ValueCountFrequency (%)
/ 46785
99.9%
, 55
 
0.1%
Lowercase Letter
ValueCountFrequency (%)
o 17
50.0%
r 17
50.0%
Dash Punctuation
ValueCountFrequency (%)
- 782638
100.0%
Space Separator
ValueCountFrequency (%)
(space) 34
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 4043416
> 99.9%
Latin 34
 
< 0.1%

Most frequent character per script

Common
ValueCountFrequency (%)
- 782638
19.4%
1 702348
17.4%
0 653087
16.2%
9 492965
12.2%
2 288127
 
7.1%
6 225522
 
5.6%
7 216691
 
5.4%
8 183523
 
4.5%
5 159737
 
4.0%
3 155914
 
3.9%
Other values (4) 182864
 
4.5%
Latin
ValueCountFrequency (%)
o 17
50.0%
r 17
50.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 4043450
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
- 782638
19.4%
1 702348
17.4%
0 653087
16.2%
9 492965
12.2%
2 288127
 
7.1%
6 225522
 
5.6%
7 216691
 
5.4%
8 183523
 
4.5%
5 159737
 
4.0%
3 155914
 
3.9%
Other values (6) 182898
 
4.5%

startDayOfYear
Text

Missing 

Distinct: 366
Distinct (%): 0.1%
Missing: 244789
Missing (%): 40.5%
Memory size: 4.6 MiB

Length

Max length: 3
Median length: 3
Mean length: 2.849043289
Min length: 1

Characters and Unicode

Total characters: 1025459
Distinct characters: 10
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 171
2nd row: 212
3rd row: 214
4th row: 116
5th row: 234
ValueCountFrequency (%)
212 4210
 
1.2%
213 4014
 
1.1%
182 3947
 
1.1%
181 3445
 
1.0%
151 3112
 
0.9%
152 2941
 
0.8%
183 2913
 
0.8%
191 2887
 
0.8%
207 2741
 
0.8%
178 2632
 
0.7%
Other values (356) 327089
90.9%

Most occurring characters

ValueCountFrequency (%)
1 238127
23.2%
2 202525
19.7%
3 100701
9.8%
9 70985
 
6.9%
0 70947
 
6.9%
4 70160
 
6.8%
5 69390
 
6.8%
8 68439
 
6.7%
6 67900
 
6.6%
7 66285
 
6.5%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 1025459
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 238127
23.2%
2 202525
19.7%
3 100701
9.8%
9 70985
 
6.9%
0 70947
 
6.9%
4 70160
 
6.8%
5 69390
 
6.8%
8 68439
 
6.7%
6 67900
 
6.6%
7 66285
 
6.5%

Most occurring scripts

ValueCountFrequency (%)
Common 1025459
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
1 238127
23.2%
2 202525
19.7%
3 100701
9.8%
9 70985
 
6.9%
0 70947
 
6.9%
4 70160
 
6.8%
5 69390
 
6.8%
8 68439
 
6.7%
6 67900
 
6.6%
7 66285
 
6.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1025459
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 238127
23.2%
2 202525
19.7%
3 100701
9.8%
9 70985
 
6.9%
0 70947
 
6.9%
4 70160
 
6.8%
5 69390
 
6.8%
8 68439
 
6.7%
6 67900
 
6.6%
7 66285
 
6.5%

endDayOfYear
Text

Missing 

Distinct: 366
Distinct (%): 0.1%
Missing: 244303
Missing (%): 40.4%
Memory size: 4.6 MiB

Length

Max length: 3
Median length: 3
Mean length: 2.857215392
Min length: 1

Characters and Unicode

Total characters: 1029789
Distinct characters: 10
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 171
2nd row: 212
3rd row: 214
4th row: 116
5th row: 234
ValueCountFrequency (%)
212 4994
 
1.4%
181 4276
 
1.2%
213 3666
 
1.0%
151 3533
 
1.0%
182 3365
 
0.9%
243 3191
 
0.9%
207 2999
 
0.8%
191 2952
 
0.8%
197 2774
 
0.8%
120 2623
 
0.7%
Other values (356) 326044
90.5%

Most occurring characters

ValueCountFrequency (%)
1 236813
23.0%
2 202617
19.7%
3 102147
9.9%
0 72053
 
7.0%
9 71767
 
7.0%
4 70263
 
6.8%
5 69867
 
6.8%
6 68660
 
6.7%
7 67928
 
6.6%
8 67674
 
6.6%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 1029789
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 236813
23.0%
2 202617
19.7%
3 102147
9.9%
0 72053
 
7.0%
9 71767
 
7.0%
4 70263
 
6.8%
5 69867
 
6.8%
6 68660
 
6.7%
7 67928
 
6.6%
8 67674
 
6.6%

Most occurring scripts

ValueCountFrequency (%)
Common 1029789
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
1 236813
23.0%
2 202617
19.7%
3 102147
9.9%
0 72053
 
7.0%
9 71767
 
7.0%
4 70263
 
6.8%
5 69867
 
6.8%
6 68660
 
6.7%
7 67928
 
6.6%
8 67674
 
6.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1029789
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 236813
23.0%
2 202617
19.7%
3 102147
9.9%
0 72053
 
7.0%
9 71767
 
7.0%
4 70263
 
6.8%
5 69867
 
6.8%
6 68660
 
6.7%
7 67928
 
6.6%
8 67674
 
6.6%

year
Text

Missing 

Distinct: 191
Distinct (%): 0.1%
Missing: 239420
Missing (%): 39.6%
Memory size: 4.6 MiB

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Characters and Unicode

Total characters: 1461200
Distinct characters: 10
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 16
Unique (%): < 0.1%

Sample

1st row: 1967
2nd row: 1914
3rd row: 2005
4th row: 1964
5th row: 1971
ValueCountFrequency (%)
1966 12313
 
3.4%
1968 9194
 
2.5%
1971 8970
 
2.5%
1967 8361
 
2.3%
1965 7882
 
2.2%
1972 6275
 
1.7%
1964 6152
 
1.7%
1974 6096
 
1.7%
1973 6078
 
1.7%
1963 5563
 
1.5%
Other values (181) 288416
79.0%

Most occurring characters

ValueCountFrequency (%)
1 398215
27.3%
9 381833
26.1%
6 108781
 
7.4%
0 108045
 
7.4%
2 92925
 
6.4%
7 89271
 
6.1%
8 74906
 
5.1%
5 72462
 
5.0%
3 69682
 
4.8%
4 65080
 
4.5%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 1461200
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 398215
27.3%
9 381833
26.1%
6 108781
 
7.4%
0 108045
 
7.4%
2 92925
 
6.4%
7 89271
 
6.1%
8 74906
 
5.1%
5 72462
 
5.0%
3 69682
 
4.8%
4 65080
 
4.5%

Most occurring scripts

ValueCountFrequency (%)
Common 1461200
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
1 398215
27.3%
9 381833
26.1%
6 108781
 
7.4%
0 108045
 
7.4%
2 92925
 
6.4%
7 89271
 
6.1%
8 74906
 
5.1%
5 72462
 
5.0%
3 69682
 
4.8%
4 65080
 
4.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1461200
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 398215
27.3%
9 381833
26.1%
6 108781
 
7.4%
0 108045
 
7.4%
2 92925
 
6.4%
7 89271
 
6.1%
8 74906
 
5.1%
5 72462
 
5.0%
3 69682
 
4.8%
4 65080
 
4.5%

month
Text

Missing 

Distinct: 13
Distinct (%): < 0.1%
Missing: 246636
Missing (%): 40.8%
Memory size: 4.6 MiB

Length

Max length: 10
Median length: 1
Mean length: 1.113249964
Min length: 1

Characters and Unicode

Total characters: 398637
Distinct characters: 13
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1
Unique (%): < 0.1%

Sample

1st row: 6
2nd row: 7
3rd row: 8
4th row: 4
5th row: 8
ValueCountFrequency (%)
7 74085
20.7%
6 58953
16.5%
8 51938
14.5%
5 36241
10.1%
9 26043
 
7.3%
4 25759
 
7.2%
3 16892
 
4.7%
10 16541
 
4.6%
2 14421
 
4.0%
11 13740
 
3.8%
Other values (4) 23473
 
6.6%

Most occurring characters

ValueCountFrequency (%)
7 74085
18.6%
1 67492
16.9%
6 58954
14.8%
8 51939
13.0%
5 36241
9.1%
9 26044
 
6.5%
4 25760
 
6.5%
2 24685
 
6.2%
3 16892
 
4.2%
0 16541
 
4.1%
Other values (3) 4
 
< 0.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 398633
> 99.9%
Space Separator 2
 
< 0.1%
Other Punctuation 1
 
< 0.1%
Uppercase Letter 1
 
< 0.1%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
7 74085
18.6%
1 67492
16.9%
6 58954
14.8%
8 51939
13.0%
5 36241
9.1%
9 26044
 
6.5%
4 25760
 
6.5%
2 24685
 
6.2%
3 16892
 
4.2%
0 16541
 
4.1%
Space Separator
ValueCountFrequency (%)
(space) 2
100.0%
Other Punctuation
ValueCountFrequency (%)
. 1
100.0%
Uppercase Letter
ValueCountFrequency (%)
S 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 398636
> 99.9%
Latin 1
 
< 0.1%

Most frequent character per script

Common
ValueCountFrequency (%)
7 74085
18.6%
1 67492
16.9%
6 58954
14.8%
8 51939
13.0%
5 36241
9.1%
9 26044
 
6.5%
4 25760
 
6.5%
2 24685
 
6.2%
3 16892
 
4.2%
0 16541
 
4.1%
Other values (2) 3
 
< 0.1%
Latin
ValueCountFrequency (%)
S 1
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 398637
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
7 74085
18.6%
1 67492
16.9%
6 58954
14.8%
8 51939
13.0%
5 36241
9.1%
9 26044
 
6.5%
4 25760
 
6.5%
2 24685
 
6.2%
3 16892
 
4.2%
0 16541
 
4.1%
Other values (3) 4
 
< 0.1%

day
Text

Missing 

Distinct: 32
Distinct (%): < 0.1%
Missing: 270887
Missing (%): 44.8%
Memory size: 4.6 MiB

Length

Max length: 10
Median length: 2
Mean length: 1.683176918
Min length: 1

Characters and Unicode

Total characters: 561900
Distinct characters: 13
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1
Unique (%): < 0.1%

Sample

1st row: 20
2nd row: 2
3rd row: 25
4th row: 22
5th row: 6
ValueCountFrequency (%)
1 20555
 
6.2%
8 13029
 
3.9%
20 12179
 
3.6%
10 11989
 
3.6%
15 11884
 
3.6%
12 11876
 
3.6%
25 11249
 
3.4%
6 11148
 
3.3%
16 11145
 
3.3%
23 10866
 
3.3%
Other values (24) 207915
62.3%

Most occurring characters

ValueCountFrequency (%)
1 155576
27.7%
2 137768
24.5%
3 45163
 
8.0%
8 33226
 
5.9%
0 33162
 
5.9%
5 33152
 
5.9%
6 32821
 
5.8%
4 31502
 
5.6%
7 30851
 
5.5%
9 28675
 
5.1%
Other values (3) 4
 
< 0.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 561896
> 99.9%
Space Separator 2
 
< 0.1%
Other Punctuation 1
 
< 0.1%
Uppercase Letter 1
 
< 0.1%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 155576
27.7%
2 137768
24.5%
3 45163
 
8.0%
8 33226
 
5.9%
0 33162
 
5.9%
5 33152
 
5.9%
6 32821
 
5.8%
4 31502
 
5.6%
7 30851
 
5.5%
9 28675
 
5.1%
Space Separator
ValueCountFrequency (%)
(space) 2
100.0%
Other Punctuation
ValueCountFrequency (%)
. 1
100.0%
Uppercase Letter
ValueCountFrequency (%)
W 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 561899
> 99.9%
Latin 1
 
< 0.1%

Most frequent character per script

Common
ValueCountFrequency (%)
1 155576
27.7%
2 137768
24.5%
3 45163
 
8.0%
8 33226
 
5.9%
0 33162
 
5.9%
5 33152
 
5.9%
6 32821
 
5.8%
4 31502
 
5.6%
7 30851
 
5.5%
9 28675
 
5.1%
Other values (2) 3
 
< 0.1%
Latin
ValueCountFrequency (%)
W 1
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 561900
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 155576
27.7%
2 137768
24.5%
3 45163
 
8.0%
8 33226
 
5.9%
0 33162
 
5.9%
5 33152
 
5.9%
6 32821
 
5.8%
4 31502
 
5.6%
7 30851
 
5.5%
9 28675
 
5.1%
Other values (3) 4
 
< 0.1%

verbatimEventDate
Text

Missing 

Distinct: 67999
Distinct (%): 32.6%
Missing: 396366
Missing (%): 65.5%
Memory size: 4.6 MiB

Length

Max length: 79
Median length: 71
Mean length: 10.59664321
Min length: 1

Characters and Unicode

Total characters: 2207853
Distinct characters: 92
Distinct categories: 11
Distinct scripts: 2
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 51583
Unique (%): 24.8%

Sample

1st row: [Not Stated]
2nd row: 2-Aug-2005
3rd row: [Not Stated]
4th row: [Not Stated]
5th row: 9-IX-78
ValueCountFrequency (%)
not 32203
 
8.2%
stated 32171
 
8.2%
july 8707
 
2.2%
aug 7740
 
2.0%
june 7233
 
1.8%
may 5958
 
1.5%
1968 5763
 
1.5%
1971 5706
 
1.5%
1966 4507
 
1.1%
1972 2978
 
0.8%
Other values (37321) 279788
71.2%

Most occurring characters

ValueCountFrequency (%)
1 217348
 
9.8%
(space) 184400
 
8.4%
9 146706
 
6.6%
- 127710
 
5.8%
2 112946
 
5.1%
t 105546
 
4.8%
I 88881
 
4.0%
6 79326
 
3.6%
0 76313
 
3.5%
. 64867
 
2.9%
Other values (82) 1003810
45.5%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 900919
40.8%
Lowercase Letter 464865
21.1%
Uppercase Letter 333421
 
15.1%
Space Separator 184400
 
8.4%
Other Punctuation 128799
 
5.8%
Dash Punctuation 127746
 
5.8%
Open Punctuation 33635
 
1.5%
Close Punctuation 33630
 
1.5%
Connector Punctuation 250
 
< 0.1%
Math Symbol 187
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
t 105546
22.7%
e 57954
12.5%
a 49169
10.6%
u 41267
 
8.9%
o 39668
 
8.5%
d 33270
 
7.2%
n 19822
 
4.3%
y 17901
 
3.9%
l 17064
 
3.7%
r 16877
 
3.6%
Other values (18) 66327
14.3%
Uppercase Letter
ValueCountFrequency (%)
I 88881
26.7%
V 43512
13.1%
N 38313
11.5%
S 36919
11.1%
J 33537
 
10.1%
A 23445
 
7.0%
M 13905
 
4.2%
X 9131
 
2.7%
U 7429
 
2.2%
E 5310
 
1.6%
Other values (17) 33039
 
9.9%
Other Punctuation
ValueCountFrequency (%)
. 64867
50.4%
, 34981
27.2%
/ 23005
 
17.9%
' 5024
 
3.9%
: 620
 
0.5%
? 141
 
0.1%
; 102
 
0.1%
& 38
 
< 0.1%
" 9
 
< 0.1%
# 6
 
< 0.1%
Other values (3) 6
 
< 0.1%
Decimal Number
ValueCountFrequency (%)
1 217348
24.1%
9 146706
16.3%
2 112946
12.5%
6 79326
 
8.8%
0 76313
 
8.5%
7 63765
 
7.1%
3 54211
 
6.0%
8 53740
 
6.0%
5 48644
 
5.4%
4 47920
 
5.3%
Open Punctuation
ValueCountFrequency (%)
[ 33547
99.7%
( 82
 
0.2%
{ 6
 
< 0.1%
Close Punctuation
ValueCountFrequency (%)
] 33542
99.7%
) 82
 
0.2%
} 6
 
< 0.1%
Math Symbol
ValueCountFrequency (%)
| 156
83.4%
+ 26
 
13.9%
= 5
 
2.7%
Dash Punctuation
ValueCountFrequency (%)
- 127710
> 99.9%
36
 
< 0.1%
Space Separator
ValueCountFrequency (%)
(space) 184400
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 250
100.0%
Modifier Symbol
ValueCountFrequency (%)
^ 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 1409567
63.8%
Latin 798286
36.2%

Most frequent character per script

Latin
ValueCountFrequency (%)
t 105546
13.2%
I 88881
 
11.1%
e 57954
 
7.3%
a 49169
 
6.2%
V 43512
 
5.5%
u 41267
 
5.2%
o 39668
 
5.0%
N 38313
 
4.8%
S 36919
 
4.6%
J 33537
 
4.2%
Other values (45) 263520
33.0%
Common
ValueCountFrequency (%)
1 217348
15.4%
(space) 184400
13.1%
9 146706
10.4%
- 127710
9.1%
2 112946
8.0%
6 79326
 
5.6%
0 76313
 
5.4%
. 64867
 
4.6%
7 63765
 
4.5%
3 54211
 
3.8%
Other values (27) 281975
20.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 2207812
> 99.9%
Punctuation 37
 
< 0.1%
None 4
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 217348
 
9.8%
(space) 184400
 
8.4%
9 146706
 
6.6%
- 127710
 
5.8%
2 112946
 
5.1%
t 105546
 
4.8%
I 88881
 
4.0%
6 79326
 
3.6%
0 76313
 
3.5%
. 64867
 
2.9%
Other values (77) 1003769
45.5%
Punctuation
ValueCountFrequency (%)
36
97.3%
1
 
2.7%
None
ValueCountFrequency (%)
û 2
50.0%
Ç 1
25.0%
ÿ 1
25.0%

habitat
Text

Missing 

Distinct: 89
Distinct (%): 44.7%
Missing: 604521
Missing (%): > 99.9%
Memory size: 4.6 MiB

Length

Max length: 103
Median length: 43
Mean length: 19.28643216
Min length: 5

Characters and Unicode

Total characters: 3838
Distinct characters: 62
Distinct categories: 8
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 64
Unique (%): 32.2%

Sample

1st row: Roadside in coniferous forest
2nd row: On a figleaf gourd
3rd row: cultivated garden
4th row: hammocks-dense hardwood & Palmetto forests
5th row: visiting mango flowers
ValueCountFrequency (%)
garden 45
 
7.4%
cultivated 44
 
7.3%
stream 26
 
4.3%
on 26
 
4.3%
forest 23
 
3.8%
in 19
 
3.1%
of 13
 
2.1%
collected 12
 
2.0%
at 9
 
1.5%
terre 8
 
1.3%
Other values (183) 381
62.9%

Most occurring characters

ValueCountFrequency (%)
(space) 407
 
10.6%
e 388
 
10.1%
a 308
 
8.0%
r 258
 
6.7%
t 250
 
6.5%
d 224
 
5.8%
n 223
 
5.8%
o 217
 
5.7%
i 190
 
5.0%
l 185
 
4.8%
Other values (52) 1188
31.0%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 3215
83.8%
Space Separator 407
 
10.6%
Uppercase Letter 126
 
3.3%
Other Punctuation 51
 
1.3%
Decimal Number 27
 
0.7%
Dash Punctuation 6
 
0.2%
Close Punctuation 3
 
0.1%
Open Punctuation 3
 
0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 388
12.1%
a 308
 
9.6%
r 258
 
8.0%
t 250
 
7.8%
d 224
 
7.0%
n 223
 
6.9%
o 217
 
6.7%
i 190
 
5.9%
l 185
 
5.8%
s 175
 
5.4%
Other values (15) 797
24.8%
Uppercase Letter
ValueCountFrequency (%)
S 28
22.2%
C 24
19.0%
R 9
 
7.1%
O 9
 
7.1%
P 8
 
6.3%
T 7
 
5.6%
I 6
 
4.8%
W 5
 
4.0%
F 5
 
4.0%
E 4
 
3.2%
Other values (10) 21
16.7%
Decimal Number
ValueCountFrequency (%)
0 8
29.6%
2 6
22.2%
1 5
18.5%
3 4
14.8%
8 2
 
7.4%
5 1
 
3.7%
7 1
 
3.7%
Other Punctuation
ValueCountFrequency (%)
, 19
37.3%
. 16
31.4%
" 6
 
11.8%
: 5
 
9.8%
& 3
 
5.9%
/ 2
 
3.9%
Space Separator
ValueCountFrequency (%)
(space) 407
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 6
100.0%
Close Punctuation
ValueCountFrequency (%)
) 3
100.0%
Open Punctuation
ValueCountFrequency (%)
( 3
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 3341
87.1%
Common 497
 
12.9%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 388
11.6%
a 308
 
9.2%
r 258
 
7.7%
t 250
 
7.5%
d 224
 
6.7%
n 223
 
6.7%
o 217
 
6.5%
i 190
 
5.7%
l 185
 
5.5%
s 175
 
5.2%
Other values (35) 923
27.6%
Common
ValueCountFrequency (%)
(space) 407
81.9%
, 19
 
3.8%
. 16
 
3.2%
0 8
 
1.6%
" 6
 
1.2%
2 6
 
1.2%
- 6
 
1.2%
1 5
 
1.0%
: 5
 
1.0%
3 4
 
0.8%
Other values (7) 15
 
3.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 3838
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
(space) 407
 
10.6%
e 388
 
10.1%
a 308
 
8.0%
r 258
 
6.7%
t 250
 
6.5%
d 224
 
5.8%
n 223
 
5.8%
o 217
 
5.7%
i 190
 
5.0%
l 185
 
4.8%
Other values (52) 1188
31.0%

locationID
Text

Missing 

Distinct: 185
Distinct (%): 17.7%
Missing: 603675
Missing (%): 99.8%
Memory size: 4.6 MiB

Length

Max length: 40
Median length: 14
Mean length: 10.78947368
Min length: 1

Characters and Unicode

Total characters: 11275
Distinct characters: 56
Distinct categories: 6
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 94
Unique (%): 9.0%

Sample

1st row: MEI Site 97-81
2nd row: RD-044
3rd row: MEI Site 97-81
4th row: MEI Site 97-81
5th row: MEI Site 97-81
ValueCountFrequency (%)
mei 652
27.5%
site 610
25.7%
97-81 301
12.7%
97-92 132
 
5.6%
97-90 52
 
2.2%
97-58 46
 
1.9%
97-74 31
 
1.3%
97-88 26
 
1.1%
97-93 24
 
1.0%
k-m1 19
 
0.8%
Other values (195) 479
20.2%

Most occurring characters

ValueCountFrequency (%)
(space) 1327
 
11.8%
- 986
 
8.7%
9 904
 
8.0%
7 770
 
6.8%
M 698
 
6.2%
I 659
 
5.8%
E 656
 
5.8%
t 638
 
5.7%
e 637
 
5.6%
i 624
 
5.5%
Other values (46) 3376
29.9%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 3620
32.1%
Uppercase Letter 3287
29.2%
Lowercase Letter 2029
18.0%
Space Separator 1327
 
11.8%
Dash Punctuation 986
 
8.7%
Other Punctuation 26
 
0.2%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
M 698
21.2%
I 659
20.0%
E 656
20.0%
S 609
18.5%
R 278
 
8.5%
D 272
 
8.3%
K 20
 
0.6%
J 14
 
0.4%
N 11
 
0.3%
L 11
 
0.3%
Other values (11) 59
 
1.8%
Lowercase Letter
ValueCountFrequency (%)
t 638
31.4%
e 637
31.4%
i 624
30.8%
l 27
 
1.3%
a 20
 
1.0%
s 20
 
1.0%
r 10
 
0.5%
o 8
 
0.4%
n 7
 
0.3%
p 7
 
0.3%
Other values (9) 31
 
1.5%
Decimal Number
ValueCountFrequency (%)
9 904
25.0%
7 770
21.3%
1 571
15.8%
8 458
12.7%
2 322
 
8.9%
0 184
 
5.1%
5 143
 
4.0%
4 95
 
2.6%
6 87
 
2.4%
3 86
 
2.4%
Other Punctuation
ValueCountFrequency (%)
# 19
73.1%
, 5
 
19.2%
. 1
 
3.8%
: 1
 
3.8%
Space Separator
ValueCountFrequency (%)
(space) 1327
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 986
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 5959
52.9%
Latin 5316
47.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
M 698
13.1%
I 659
12.4%
E 656
12.3%
t 638
12.0%
e 637
12.0%
i 624
11.7%
S 609
11.5%
R 278
 
5.2%
D 272
 
5.1%
l 27
 
0.5%
Other values (30) 218
 
4.1%
Common
ValueCountFrequency (%)
(space) 1327
22.3%
- 986
16.5%
9 904
15.2%
7 770
12.9%
1 571
9.6%
8 458
 
7.7%
2 322
 
5.4%
0 184
 
3.1%
5 143
 
2.4%
4 95
 
1.6%
Other values (6) 199
 
3.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 11275
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
(space) 1327
 
11.8%
- 986
 
8.7%
9 904
 
8.0%
7 770
 
6.8%
M 698
 
6.2%
I 659
 
5.8%
E 656
 
5.8%
t 638
 
5.7%
e 637
 
5.6%
i 624
 
5.5%
Other values (46) 3376
29.9%

higherGeography
Text

Missing 

Distinct: 10596
Distinct (%): 2.4%
Missing: 156093
Missing (%): 25.8%
Memory size: 4.6 MiB

Length

Max length: 101
Median length: 91
Mean length: 30.38929222
Min length: 4

Characters and Unicode

Total characters: 13633457
Distinct characters: 132
Distinct categories: 11
Distinct scripts: 2
Distinct blocks: 5
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 3142
Unique (%): 0.7%

Sample

1st row: United States, [Not Stated], [Not Stated]
2nd row: Costa Rica, Cartago, [Not Stated]
3rd row: United States, Alaska, Aleutians West
4th row: United States, Virginia, Virginia Beach
5th row: United States, New York, [Not Stated]
ValueCountFrequency (%)
united 222849
 
12.1%
states 221117
 
12.1%
not 168021
 
9.2%
stated 168019
 
9.2%
california 23411
 
1.3%
virginia 23321
 
1.3%
new 22503
 
1.2%
colorado 21080
 
1.1%
mexico 21004
 
1.1%
canada 16233
 
0.9%
Other values (6796) 927210
50.5%

Most occurring characters

ValueCountFrequency (%)
a 1386867
 
10.2%
t 1386839
 
10.2%
(space) 1386141
 
10.2%
e 1091011
 
8.0%
i 816099
 
6.0%
n 814243
 
6.0%
, 798935
 
5.9%
o 692570
 
5.1%
d 580440
 
4.3%
s 501693
 
3.7%
Other values (122) 4178619
30.6%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 9268484
68.0%
Uppercase Letter 1826618
 
13.4%
Space Separator 1386141
 
10.2%
Other Punctuation 805778
 
5.9%
Open Punctuation 168048
 
1.2%
Close Punctuation 167999
 
1.2%
Dash Punctuation 10310
 
0.1%
Decimal Number 75
 
< 0.1%
Modifier Letter 2
 
< 0.1%
Math Symbol 1
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 1386867
15.0%
t 1386839
15.0%
e 1091011
11.8%
i 816099
8.8%
n 814243
8.8%
o 692570
7.5%
d 580440
6.3%
s 501693
 
5.4%
r 454316
 
4.9%
l 313893
 
3.4%
Other values (59) 1230513
13.3%
Uppercase Letter
ValueCountFrequency (%)
S 462311
25.3%
U 242099
13.3%
N 220752
12.1%
C 174694
 
9.6%
M 92438
 
5.1%
P 64247
 
3.5%
B 57602
 
3.2%
A 54181
 
3.0%
T 52091
 
2.9%
I 45082
 
2.5%
Other values (27) 361121
19.8%
Other Punctuation
ValueCountFrequency (%)
, 798935
99.2%
' 3984
 
0.5%
. 2433
 
0.3%
/ 183
 
< 0.1%
? 152
 
< 0.1%
& 50
 
< 0.1%
: 39
 
< 0.1%
; 1
 
< 0.1%
¡ 1
 
< 0.1%
Decimal Number
ValueCountFrequency (%)
3 46
61.3%
9 14
 
18.7%
4 11
 
14.7%
2 2
 
2.7%
8 1
 
1.3%
1 1
 
1.3%
Dash Punctuation
ValueCountFrequency (%)
- 10286
99.8%
22
 
0.2%
2
 
< 0.1%
Open Punctuation
ValueCountFrequency (%)
[ 168014
> 99.9%
( 34
 
< 0.1%
Close Punctuation
ValueCountFrequency (%)
] 167965
> 99.9%
) 34
 
< 0.1%
Space Separator
ValueCountFrequency (%)
(space) 1386141
100.0%
Modifier Letter
ValueCountFrequency (%)
ʻ 2
100.0%
Math Symbol
ValueCountFrequency (%)
= 1
100.0%
Control
ValueCountFrequency (%)
 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 11095102
81.4%
Common 2538355
 
18.6%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 1386867
12.5%
t 1386839
12.5%
e 1091011
 
9.8%
i 816099
 
7.4%
n 814243
 
7.3%
o 692570
 
6.2%
d 580440
 
5.2%
s 501693
 
4.5%
S 462311
 
4.2%
r 454316
 
4.1%
Other values (96) 2908713
26.2%
Common
ValueCountFrequency (%)
(space) 1386141
54.6%
, 798935
31.5%
[ 168014
 
6.6%
] 167965
 
6.6%
- 10286
 
0.4%
' 3984
 
0.2%
. 2433
 
0.1%
/ 183
 
< 0.1%
? 152
 
< 0.1%
& 50
 
< 0.1%
Other values (16) 212
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 13627164
> 99.9%
None 6245
 
< 0.1%
Punctuation 24
 
< 0.1%
Latin Ext Additional 22
 
< 0.1%
Modifier Letters 2
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 1386867
 
10.2%
t 1386839
 
10.2%
(space) 1386141
 
10.2%
e 1091011
 
8.0%
i 816099
 
6.0%
n 814243
 
6.0%
, 798935
 
5.9%
o 692570
 
5.1%
d 580440
 
4.3%
s 501693
 
3.7%
Other values (63) 4172326
30.6%
None
ValueCountFrequency (%)
á 1227
19.6%
ü 1114
17.8%
í 1027
16.4%
ó 731
11.7%
é 700
11.2%
ã 292
 
4.7%
ô 268
 
4.3%
ø 167
 
2.7%
è 135
 
2.2%
ä 68
 
1.1%
Other values (45) 516
8.3%
Latin Ext Additional
ValueCountFrequency (%)
22
100.0%
Punctuation
ValueCountFrequency (%)
22
91.7%
2
 
8.3%
Modifier Letters
ValueCountFrequency (%)
ʻ 2
100.0%

continent
Text

Missing 

Distinct: 6
Distinct (%): 4.7%
Missing: 604592
Missing (%): > 99.9%
Memory size: 4.6 MiB

Length

Max length: 13
Median length: 4
Mean length: 7.15625
Min length: 4

Characters and Unicode

Total characters: 916
Distinct characters: 21
Distinct categories: 5
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1
Unique (%): 0.8%

Sample

1st row: South America
2nd row: Asia
3rd row: South America
4th row: Europe
5th row: Asia
ValueCountFrequency (%)
asia 69
40.8%
america 40
23.7%
north 21
 
12.4%
south 19
 
11.2%
europe 9
 
5.3%
africa 9
 
5.3%
not 1
 
0.6%
stated 1
 
0.6%

Most occurring characters

ValueCountFrequency (%)
a 119
13.0%
A 118
12.9%
i 118
12.9%
r 79
8.6%
s 69
 
7.5%
o 50
 
5.5%
e 50
 
5.5%
c 49
 
5.3%
t 43
 
4.7%
(space) 41
 
4.5%
Other values (11) 180
19.7%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 704
76.9%
Uppercase Letter 169
 
18.4%
Space Separator 41
 
4.5%
Open Punctuation 1
 
0.1%
Close Punctuation 1
 
0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 119
16.9%
i 118
16.8%
r 79
11.2%
s 69
9.8%
o 50
7.1%
e 50
7.1%
c 49
7.0%
t 43
 
6.1%
m 40
 
5.7%
h 40
 
5.7%
Other values (4) 47
 
6.7%
Uppercase Letter
ValueCountFrequency (%)
A 118
69.8%
N 22
 
13.0%
S 20
 
11.8%
E 9
 
5.3%
Space Separator
ValueCountFrequency (%)
(space) 41
100.0%
Open Punctuation
ValueCountFrequency (%)
[ 1
100.0%
Close Punctuation
ValueCountFrequency (%)
] 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 873
95.3%
Common 43
 
4.7%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 119
13.6%
A 118
13.5%
i 118
13.5%
r 79
9.0%
s 69
7.9%
o 50
 
5.7%
e 50
 
5.7%
c 49
 
5.6%
t 43
 
4.9%
m 40
 
4.6%
Other values (8) 138
15.8%
Common
ValueCountFrequency (%)
(space) 41
95.3%
[ 1
 
2.3%
] 1
 
2.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 916
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 119
13.0%
A 118
12.9%
i 118
12.9%
r 79
8.6%
s 69
 
7.5%
o 50
 
5.5%
e 50
 
5.5%
c 49
 
5.3%
t 43
 
4.7%
(space) 41
 
4.5%
Other values (11) 180
19.7%

waterBody
Text

Constant  Missing 

Distinct: 1
Distinct (%): 100.0%
Missing: 604719
Missing (%): > 99.9%
Memory size: 4.6 MiB

Length

Max length: 9
Median length: 9
Mean length: 9
Min length: 9

Characters and Unicode

Total characters: 9
Distinct characters: 8
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1
Unique (%): 100.0%

Sample

1st row: DeMarmels
ValueCountFrequency (%)
demarmels 1
100.0%

Most occurring characters

ValueCountFrequency (%)
e 2
22.2%
D 1
11.1%
M 1
11.1%
a 1
11.1%
r 1
11.1%
m 1
11.1%
l 1
11.1%
s 1
11.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 7
77.8%
Uppercase Letter 2
 
22.2%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 2
28.6%
a 1
14.3%
r 1
14.3%
m 1
14.3%
l 1
14.3%
s 1
14.3%
Uppercase Letter
ValueCountFrequency (%)
D 1
50.0%
M 1
50.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 9
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 2
22.2%
D 1
11.1%
M 1
11.1%
a 1
11.1%
r 1
11.1%
m 1
11.1%
l 1
11.1%
s 1
11.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 9
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 2
22.2%
D 1
11.1%
M 1
11.1%
a 1
11.1%
r 1
11.1%
m 1
11.1%
l 1
11.1%
s 1
11.1%

islandGroup
Text

Missing 

Distinct: 72
Distinct (%): 2.9%
Missing: 602200
Missing (%): 99.6%
Memory size: 4.6 MiB

Length

Max length: 22
Median length: 13
Mean length: 13.7202381
Min length: 5

Characters and Unicode

Total characters: 34575
Distinct characters: 49
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 21
Unique (%): 0.8%

Sample

1st row: Sunda Islands
2nd row: Inner Islands
3rd row: Viti Levu Group
4th row: Chuuk Lagoon
5th row: Sunda Islands
ValueCountFrequency (%)
islands 2160
42.2%
sunda 956
18.7%
marquesas 249
 
4.9%
solomon 226
 
4.4%
bass 171
 
3.3%
chuuk 149
 
2.9%
lagoon 149
 
2.9%
outer 149
 
2.9%
inner 140
 
2.7%
group 100
 
2.0%
Other values (78) 673
 
13.1%

Most occurring characters

ValueCountFrequency (%)
s 5365
15.5%
a 4395
12.7%
n 3948
11.4%
d 3266
9.4%
(space) 2602
7.5%
l 2568
7.4%
I 2313
6.7%
u 1953
 
5.6%
S 1250
 
3.6%
o 1226
 
3.5%
Other values (39) 5689
16.5%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 26832
77.6%
Uppercase Letter 5122
 
14.8%
Space Separator 2602
 
7.5%
Other Punctuation 19
 
0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
s 5365
20.0%
a 4395
16.4%
n 3948
14.7%
d 3266
12.2%
l 2568
9.6%
u 1953
 
7.3%
o 1226
 
4.6%
r 905
 
3.4%
e 893
 
3.3%
i 343
 
1.3%
Other values (14) 1970
 
7.3%
Uppercase Letter
ValueCountFrequency (%)
I 2313
45.2%
S 1250
24.4%
M 256
 
5.0%
L 237
 
4.6%
C 200
 
3.9%
B 171
 
3.3%
O 158
 
3.1%
G 147
 
2.9%
V 87
 
1.7%
N 75
 
1.5%
Other values (12) 228
 
4.5%
Other Punctuation
ValueCountFrequency (%)
' 10
52.6%
. 9
47.4%
Space Separator
ValueCountFrequency (%)
(space) 2602
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 31954
92.4%
Common 2621
 
7.6%

Most frequent character per script

Latin
ValueCountFrequency (%)
s 5365
16.8%
a 4395
13.8%
n 3948
12.4%
d 3266
10.2%
l 2568
8.0%
I 2313
7.2%
u 1953
 
6.1%
S 1250
 
3.9%
o 1226
 
3.8%
r 905
 
2.8%
Other values (36) 4765
14.9%
Common
ValueCountFrequency (%)
2602
99.3%
' 10
 
0.4%
. 9
 
0.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 34575
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
s 5365
15.5%
a 4395
12.7%
n 3948
11.4%
d 3266
9.4%
2602
7.5%
l 2568
7.4%
I 2313
6.7%
u 1953
 
5.6%
S 1250
 
3.6%
o 1226
 
3.5%
Other values (39) 5689
16.5%

island
Text

Missing 

Distinct436
Distinct (%)4.7%
Missing595353
Missing (%)98.5%
Memory size4.6 MiB

Length

Max length24
Median length21
Mean length9.324436853
Min length3

Characters and Unicode

Total characters87342
Distinct characters62
Distinct categories8
Distinct scripts2
Distinct blocks3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique168
Unique (%)1.8%

Sample

1st rowSouth Island
2nd rowPohnpei
3rd rowSouth Island
4th rowOahu
5th rowGuadalcanal
ValueCountFrequency (%)
island 3167
21.5%
south 1636
 
11.1%
java 884
 
6.0%
levu 565
 
3.8%
viti 541
 
3.7%
north 519
 
3.5%
guadalcanal 327
 
2.2%
borneo 253
 
1.7%
hiva 247
 
1.7%
key 246
 
1.7%
Other values (438) 6372
43.2%

Most occurring characters

ValueCountFrequency (%)
a 12933
14.8%
n 6143
 
7.0%
l 5485
 
6.3%
o 5446
 
6.2%
5390
 
6.2%
u 4466
 
5.1%
d 4450
 
5.1%
s 4126
 
4.7%
e 3908
 
4.5%
t 3745
 
4.3%
Other values (52) 31250
35.8%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 66998
76.7%
Uppercase Letter 14740
 
16.9%
Space Separator 5390
 
6.2%
Other Punctuation 169
 
0.2%
Dash Punctuation 18
 
< 0.1%
Open Punctuation 13
 
< 0.1%
Close Punctuation 13
 
< 0.1%
Modifier Letter 1
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 12933
19.3%
n 6143
9.2%
l 5485
8.2%
o 5446
8.1%
u 4466
 
6.7%
d 4450
 
6.6%
s 4126
 
6.2%
e 3908
 
5.8%
t 3745
 
5.6%
i 3651
 
5.4%
Other values (19) 12645
18.9%
Uppercase Letter
ValueCountFrequency (%)
I 3295
22.4%
S 2358
16.0%
N 1067
 
7.2%
J 892
 
6.1%
L 820
 
5.6%
B 722
 
4.9%
V 681
 
4.6%
G 648
 
4.4%
M 648
 
4.4%
H 619
 
4.2%
Other values (14) 2990
20.3%
Other Punctuation
ValueCountFrequency (%)
' 164
97.0%
. 5
 
3.0%
Open Punctuation
ValueCountFrequency (%)
( 12
92.3%
[ 1
 
7.7%
Close Punctuation
ValueCountFrequency (%)
) 12
92.3%
] 1
 
7.7%
Space Separator
ValueCountFrequency (%)
5390
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 18
100.0%
Modifier Letter
ValueCountFrequency (%)
ʻ 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 81738
93.6%
Common 5604
 
6.4%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 12933
15.8%
n 6143
 
7.5%
l 5485
 
6.7%
o 5446
 
6.7%
u 4466
 
5.5%
d 4450
 
5.4%
s 4126
 
5.0%
e 3908
 
4.8%
t 3745
 
4.6%
i 3651
 
4.5%
Other values (43) 27385
33.5%
Common
ValueCountFrequency (%)
5390
96.2%
' 164
 
2.9%
- 18
 
0.3%
( 12
 
0.2%
) 12
 
0.2%
. 5
 
0.1%
ʻ 1
 
< 0.1%
[ 1
 
< 0.1%
] 1
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 87316
> 99.9%
None 25
 
< 0.1%
Modifier Letters 1
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 12933
14.8%
n 6143
 
7.0%
l 5485
 
6.3%
o 5446
 
6.2%
5390
 
6.2%
u 4466
 
5.1%
d 4450
 
5.1%
s 4126
 
4.7%
e 3908
 
4.5%
t 3745
 
4.3%
Other values (47) 31224
35.8%
None
ValueCountFrequency (%)
ñ 13
52.0%
ó 7
28.0%
é 4
 
16.0%
Ž 1
 
4.0%
Modifier Letters
ValueCountFrequency (%)
ʻ 1
100.0%

country
Text

Missing 

Distinct361
Distinct (%)0.1%
Missing156115
Missing (%)25.8%
Memory size4.6 MiB

Length

Max length57
Median length44
Mean length10.35667681
Min length4

Characters and Unicode

Total characters4646057
Distinct characters66
Distinct categories7
Distinct scripts2
Distinct blocks2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique74
Unique (%)< 0.1%

Sample

1st rowUnited States
2nd rowCosta Rica
3rd rowUnited States
4th rowUnited States
5th rowUnited States
ValueCountFrequency (%)
united 222629
30.9%
states 220899
30.7%
canada 16232
 
2.3%
mexico 15811
 
2.2%
china 14526
 
2.0%
brazil 12973
 
1.8%
costa 8910
 
1.2%
rica 8910
 
1.2%
peru 7637
 
1.1%
india 7029
 
1.0%
Other values (376) 184674
25.6%

Most occurring characters

ValueCountFrequency (%)
t 718259
15.5%
e 560772
12.1%
a 528526
11.4%
i 389761
8.4%
n 365385
7.9%
d 287382
 
6.2%
271625
 
5.8%
s 261111
 
5.6%
S 244256
 
5.3%
U 223931
 
4.8%
Other values (56) 795049
17.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 3644367
78.4%
Uppercase Letter 715786
 
15.4%
Space Separator 271625
 
5.8%
Close Punctuation 6527
 
0.1%
Open Punctuation 6527
 
0.1%
Other Punctuation 1214
 
< 0.1%
Dash Punctuation 11
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
t 718259
19.7%
e 560772
15.4%
a 528526
14.5%
i 389761
10.7%
n 365385
10.0%
d 287382
7.9%
s 261111
 
7.2%
o 84202
 
2.3%
r 69612
 
1.9%
l 68426
 
1.9%
Other values (18) 310931
8.5%
Uppercase Letter
ValueCountFrequency (%)
S 244256
34.1%
U 223931
31.3%
C 53206
 
7.4%
P 26355
 
3.7%
M 23603
 
3.3%
B 19253
 
2.7%
I 15375
 
2.1%
N 14724
 
2.1%
R 12978
 
1.8%
G 12741
 
1.8%
Other values (15) 69364
 
9.7%
Other Punctuation
ValueCountFrequency (%)
, 945
77.8%
. 111
 
9.1%
' 105
 
8.6%
: 36
 
3.0%
? 10
 
0.8%
/ 6
 
0.5%
; 1
 
0.1%
Close Punctuation
ValueCountFrequency (%)
] 6524
> 99.9%
) 3
 
< 0.1%
Open Punctuation
ValueCountFrequency (%)
[ 6524
> 99.9%
( 3
 
< 0.1%
Space Separator
ValueCountFrequency (%)
271625
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 11
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 4360153
93.8%
Common 285904
 
6.2%

Most frequent character per script

Latin
ValueCountFrequency (%)
t 718259
16.5%
e 560772
12.9%
a 528526
12.1%
i 389761
8.9%
n 365385
8.4%
d 287382
6.6%
s 261111
 
6.0%
S 244256
 
5.6%
U 223931
 
5.1%
o 84202
 
1.9%
Other values (43) 696568
16.0%
Common
ValueCountFrequency (%)
271625
95.0%
] 6524
 
2.3%
[ 6524
 
2.3%
, 945
 
0.3%
. 111
 
< 0.1%
' 105
 
< 0.1%
: 36
 
< 0.1%
- 11
 
< 0.1%
? 10
 
< 0.1%
/ 6
 
< 0.1%
Other values (3) 7
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 4645987
> 99.9%
None 70
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
t 718259
15.5%
e 560772
12.1%
a 528526
11.4%
i 389761
8.4%
n 365385
7.9%
d 287382
 
6.2%
271625
 
5.8%
s 261111
 
5.6%
S 244256
 
5.3%
U 223931
 
4.8%
Other values (54) 794979
17.1%
None
ValueCountFrequency (%)
ô 69
98.6%
ç 1
 
1.4%

stateProvince
Text

Missing 

Distinct3068
Distinct (%)0.7%
Missing173239
Missing (%)28.6%
Memory size4.6 MiB

Length

Max length57
Median length44
Mean length9.044942883
Min length2

Characters and Unicode

Total characters3902721
Distinct characters116
Distinct categories10
Distinct scripts2
Distinct blocks5
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique808
Unique (%)0.2%

Sample

1st row[Not Stated]
2nd rowCartago
3rd rowAlaska
4th rowVirginia
5th rowNew York
ValueCountFrequency (%)
not 29440
 
5.2%
stated 29440
 
5.2%
california 23322
 
4.1%
virginia 22013
 
3.9%
colorado 20952
 
3.7%
new 16651
 
2.9%
texas 12341
 
2.2%
arizona 12146
 
2.1%
florida 9884
 
1.7%
maryland 9608
 
1.7%
Other values (2915) 379877
67.2%

Most occurring characters

ValueCountFrequency (%)
a 524357
 
13.4%
o 333196
 
8.5%
i 321786
 
8.2%
n 299093
 
7.7%
r 250082
 
6.4%
e 216703
 
5.6%
t 208658
 
5.3%
s 151919
 
3.9%
l 138292
 
3.5%
134193
 
3.4%
Other values (106) 1324442
33.9%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 3135689
80.3%
Uppercase Letter 563884
 
14.4%
Space Separator 134193
 
3.4%
Open Punctuation 29409
 
0.8%
Close Punctuation 29400
 
0.8%
Dash Punctuation 8111
 
0.2%
Other Punctuation 1958
 
0.1%
Decimal Number 75
 
< 0.1%
Control 1
 
< 0.1%
Modifier Letter 1
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 524357
16.7%
o 333196
10.6%
i 321786
10.3%
n 299093
9.5%
r 250082
8.0%
e 216703
 
6.9%
t 208658
 
6.7%
s 151919
 
4.8%
l 138292
 
4.4%
d 113012
 
3.6%
Other values (49) 578591
18.5%
Uppercase Letter
ValueCountFrequency (%)
C 79656
14.1%
N 67159
11.9%
S 61382
10.9%
M 46174
 
8.2%
T 31302
 
5.6%
A 30288
 
5.4%
V 29057
 
5.2%
W 27116
 
4.8%
P 20337
 
3.6%
I 18302
 
3.2%
Other values (25) 153111
27.2%
Other Punctuation
ValueCountFrequency (%)
. 987
50.4%
' 638
32.6%
? 138
 
7.0%
/ 121
 
6.2%
, 70
 
3.6%
: 3
 
0.2%
¡ 1
 
0.1%
Decimal Number
ValueCountFrequency (%)
3 46
61.3%
9 14
 
18.7%
4 11
 
14.7%
2 2
 
2.7%
8 1
 
1.3%
1 1
 
1.3%
Open Punctuation
ValueCountFrequency (%)
[ 29408
> 99.9%
( 1
 
< 0.1%
Close Punctuation
ValueCountFrequency (%)
] 29399
> 99.9%
) 1
 
< 0.1%
Dash Punctuation
ValueCountFrequency (%)
- 8089
99.7%
22
 
0.3%
Space Separator
ValueCountFrequency (%)
134193
100.0%
Control
ValueCountFrequency (%)
 1
100.0%
Modifier Letter
ValueCountFrequency (%)
ʻ 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 3699573
94.8%
Common 203148
 
5.2%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 524357
14.2%
o 333196
 
9.0%
i 321786
 
8.7%
n 299093
 
8.1%
r 250082
 
6.8%
e 216703
 
5.9%
t 208658
 
5.6%
s 151919
 
4.1%
l 138292
 
3.7%
d 113012
 
3.1%
Other values (84) 1142475
30.9%
Common
ValueCountFrequency (%)
134193
66.1%
[ 29408
 
14.5%
] 29399
 
14.5%
- 8089
 
4.0%
. 987
 
0.5%
' 638
 
0.3%
? 138
 
0.1%
/ 121
 
0.1%
, 70
 
< 0.1%
3 46
 
< 0.1%
Other values (12) 59
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 3897577
99.9%
None 5099
 
0.1%
Latin Ext Additional 22
 
< 0.1%
Punctuation 22
 
< 0.1%
Modifier Letters 1
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 524357
 
13.5%
o 333196
 
8.5%
i 321786
 
8.3%
n 299093
 
7.7%
r 250082
 
6.4%
e 216703
 
5.6%
t 208658
 
5.4%
s 151919
 
3.9%
l 138292
 
3.5%
134193
 
3.4%
Other values (60) 1319298
33.8%
None
ValueCountFrequency (%)
á 1200
23.5%
ü 991
19.4%
í 928
18.2%
ó 488
9.6%
é 410
 
8.0%
ã 292
 
5.7%
ø 158
 
3.1%
ô 125
 
2.5%
è 117
 
2.3%
ä 54
 
1.1%
Other values (33) 336
 
6.6%
Latin Ext Additional
ValueCountFrequency (%)
22
100.0%
Punctuation
ValueCountFrequency (%)
22
100.0%
Modifier Letters
ValueCountFrequency (%)
ʻ 1
100.0%

county
Text

Missing 

Distinct4068
Distinct (%)1.2%
Missing254867
Missing (%)42.1%
Memory size4.6 MiB

Length

Max length51
Median length45
Mean length9.456280209
Min length1

Characters and Unicode

Total characters3308308
Distinct characters98
Distinct categories8
Distinct scripts2
Distinct blocks2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique1157
Unique (%)0.3%

Sample

1st row[Not Stated]
2nd row[Not Stated]
3rd rowAleutians West
4th rowVirginia Beach
5th row[Not Stated]
ValueCountFrequency (%)
not 132062
25.3%
stated 132060
25.3%
boulder 6789
 
1.3%
creek 6760
 
1.3%
clear 6751
 
1.3%
san 5405
 
1.0%
montgomery 4939
 
0.9%
cochise 4320
 
0.8%
prince 3492
 
0.7%
tuolumne 3206
 
0.6%
Other values (4079) 215282
41.3%

Most occurring characters

ValueCountFrequency (%)
t 455467
13.8%
a 309900
 
9.4%
e 305731
 
9.2%
o 264738
 
8.0%
171213
 
5.2%
d 169224
 
5.1%
S 152130
 
4.6%
N 137690
 
4.2%
n 133853
 
4.0%
[ 132080
 
4.0%
Other values (88) 1076282
32.5%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 2347003
70.9%
Uppercase Letter 519166
 
15.7%
Space Separator 171213
 
5.2%
Open Punctuation 132098
 
4.0%
Close Punctuation 132058
 
4.0%
Other Punctuation 4601
 
0.1%
Dash Punctuation 2168
 
0.1%
Math Symbol 1
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
t 455467
19.4%
a 309900
13.2%
e 305731
13.0%
o 264738
11.3%
d 169224
 
7.2%
n 133853
 
5.7%
r 128735
 
5.5%
i 96642
 
4.1%
l 92695
 
3.9%
s 72035
 
3.1%
Other values (42) 317983
13.5%
Uppercase Letter
ValueCountFrequency (%)
S 152130
29.3%
N 137690
26.5%
C 39707
 
7.6%
B 24571
 
4.7%
M 21494
 
4.1%
P 16767
 
3.2%
W 13595
 
2.6%
L 12294
 
2.4%
G 12068
 
2.3%
T 10765
 
2.1%
Other values (23) 78085
15.0%
Other Punctuation
ValueCountFrequency (%)
' 3065
66.6%
. 1321
28.7%
, 105
 
2.3%
/ 56
 
1.2%
& 50
 
1.1%
? 4
 
0.1%
Open Punctuation
ValueCountFrequency (%)
[ 132080
> 99.9%
( 18
 
< 0.1%
Close Punctuation
ValueCountFrequency (%)
] 132040
> 99.9%
) 18
 
< 0.1%
Space Separator
ValueCountFrequency (%)
171213
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 2168
100.0%
Math Symbol
ValueCountFrequency (%)
= 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 2866169
86.6%
Common 442139
 
13.4%

Most frequent character per script

Latin
ValueCountFrequency (%)
t 455467
15.9%
a 309900
10.8%
e 305731
10.7%
o 264738
 
9.2%
d 169224
 
5.9%
S 152130
 
5.3%
N 137690
 
4.8%
n 133853
 
4.7%
r 128735
 
4.5%
i 96642
 
3.4%
Other values (75) 712059
24.8%
Common
ValueCountFrequency (%)
171213
38.7%
[ 132080
29.9%
] 132040
29.9%
' 3065
 
0.7%
- 2168
 
0.5%
. 1321
 
0.3%
, 105
 
< 0.1%
/ 56
 
< 0.1%
& 50
 
< 0.1%
( 18
 
< 0.1%
Other values (3) 23
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 3307261
> 99.9%
None 1047
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
t 455467
13.8%
a 309900
 
9.4%
e 305731
 
9.2%
o 264738
 
8.0%
171213
 
5.2%
d 169224
 
5.1%
S 152130
 
4.6%
N 137690
 
4.2%
n 133853
 
4.0%
[ 132080
 
4.0%
Other values (55) 1075235
32.5%
None
ValueCountFrequency (%)
é 285
27.2%
ó 235
22.4%
ü 123
11.7%
í 99
 
9.5%
ô 74
 
7.1%
Ñ 29
 
2.8%
á 27
 
2.6%
è 18
 
1.7%
ś 16
 
1.5%
ć 15
 
1.4%
Other values (23) 126
12.0%

locality
Text

Missing 

Distinct76621
Distinct (%)17.2%
Missing158363
Missing (%)26.2%
Memory size4.6 MiB

Length

Max length550043
Median length182
Mean length24.13015367
Min length1

Characters and Unicode

Total characters10770663
Distinct characters148
Distinct categories16
Distinct scripts2
Distinct blocks3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique44463
Unique (%)10.0%

Sample

1st row[Not Stated]
2nd rowRio Aquiares, Turrialba
3rd rowSaint Paul Island, Bering Sea
4th rowFalse Cape State Park, Wash Woods, 100 meters east of Interpreter's residence
5th row[Not Stated]
ValueCountFrequency (%)
not 66601
 
4.1%
stated 66524
 
4.1%
of 42103
 
2.6%
miles 21225
 
1.3%
kilometers 15789
 
1.0%
park 15479
 
1.0%
river 15374
 
1.0%
lake 14864
 
0.9%
near 12865
 
0.8%
creek 12692
 
0.8%
Other values (59148) 1327951
82.4%

Most occurring characters

ValueCountFrequency (%)
1113923
 
10.3%
a 970976
 
9.0%
e 784777
 
7.3%
o 677654
 
6.3%
t 644226
 
6.0%
n 525770
 
4.9%
i 505577
 
4.7%
r 496321
 
4.6%
l 397845
 
3.7%
s 367559
 
3.4%
Other values (138) 4286035
39.8%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 7196462
66.8%
Uppercase Letter 1266334
 
11.8%
Space Separator 1113923
 
10.3%
Decimal Number 367182
 
3.4%
Other Punctuation 344983
 
3.2%
Control 288676
 
2.7%
Open Punctuation 78963
 
0.7%
Close Punctuation 78951
 
0.7%
Dash Punctuation 33528
 
0.3%
Math Symbol 1306
 
< 0.1%
Other values (6) 355
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 970976
13.5%
e 784777
10.9%
o 677654
9.4%
t 644226
9.0%
n 525770
 
7.3%
i 505577
 
7.0%
r 496321
 
6.9%
l 397845
 
5.5%
s 367559
 
5.1%
u 258028
 
3.6%
Other values (48) 1567729
21.8%
Uppercase Letter
ValueCountFrequency (%)
S 174256
13.8%
N 124539
 
9.8%
C 121685
 
9.6%
P 93135
 
7.4%
R 84162
 
6.6%
M 83089
 
6.6%
B 66969
 
5.3%
L 58035
 
4.6%
A 52250
 
4.1%
F 47325
 
3.7%
Other values (30) 360889
28.5%
Other Punctuation
ValueCountFrequency (%)
, 157009
45.5%
. 82199
23.8%
; 57201
 
16.6%
: 21295
 
6.2%
/ 12921
 
3.7%
' 9409
 
2.7%
? 1932
 
0.6%
" 1379
 
0.4%
& 936
 
0.3%
# 665
 
0.2%
Other values (5) 37
 
< 0.1%
Decimal Number
ValueCountFrequency (%)
0 65034
17.7%
1 61769
16.8%
2 43526
11.9%
3 35033
9.5%
5 34550
9.4%
6 29303
8.0%
4 27484
7.5%
8 23936
 
6.5%
9 23798
 
6.5%
7 22749
 
6.2%
Math Symbol
ValueCountFrequency (%)
= 649
49.7%
+ 302
23.1%
~ 250
 
19.1%
| 102
 
7.8%
< 2
 
0.2%
> 1
 
0.1%
Open Punctuation
ValueCountFrequency (%)
[ 70524
89.3%
( 8335
 
10.6%
{ 103
 
0.1%
1
 
< 0.1%
Control
ValueCountFrequency (%)
287154
99.5%
1520
 
0.5%
 2
 
< 0.1%
Close Punctuation
ValueCountFrequency (%)
] 70480
89.3%
) 8328
 
10.5%
} 143
 
0.2%
Modifier Symbol
ValueCountFrequency (%)
´ 3
60.0%
¯ 2
40.0%
Space Separator
ValueCountFrequency (%)
1113923
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 33528
100.0%
Other Symbol
ValueCountFrequency (%)
° 134
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 134
100.0%
Currency Symbol
ValueCountFrequency (%)
¢ 50
100.0%
Final Punctuation
ValueCountFrequency (%)
26
100.0%
Initial Punctuation
ValueCountFrequency (%)
6
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 8462796
78.6%
Common 2307867
 
21.4%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 970976
 
11.5%
e 784777
 
9.3%
o 677654
 
8.0%
t 644226
 
7.6%
n 525770
 
6.2%
i 505577
 
6.0%
r 496321
 
5.9%
l 397845
 
4.7%
s 367559
 
4.3%
u 258028
 
3.0%
Other values (88) 2834063
33.5%
Common
ValueCountFrequency (%)
1113923
48.3%
287154
 
12.4%
, 157009
 
6.8%
. 82199
 
3.6%
[ 70524
 
3.1%
] 70480
 
3.1%
0 65034
 
2.8%
1 61769
 
2.7%
; 57201
 
2.5%
2 43526
 
1.9%
Other values (40) 299048
 
13.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 10768129
> 99.9%
None 2500
 
< 0.1%
Punctuation 34
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1113923
 
10.3%
a 970976
 
9.0%
e 784777
 
7.3%
o 677654
 
6.3%
t 644226
 
6.0%
n 525770
 
4.9%
i 505577
 
4.7%
r 496321
 
4.6%
l 397845
 
3.7%
s 367559
 
3.4%
Other values (82) 4283501
39.8%
None
ValueCountFrequency (%)
ñ 374
15.0%
ó 346
13.8%
á 338
13.5%
é 321
12.8%
ã 219
8.8%
ü 178
7.1%
í 138
 
5.5%
° 134
 
5.4%
ç 115
 
4.6%
¢ 50
 
2.0%
Other values (42) 287
11.5%
Punctuation
ValueCountFrequency (%)
26
76.5%
6
 
17.6%
1
 
2.9%
1
 
2.9%
Distinct1812
Distinct (%)3.9%
Missing558058
Missing (%)92.3%
Memory size4.6 MiB

Length

Max length9
Median length7
Mean length5.369958424
Min length3

Characters and Unicode

Total characters250573
Distinct characters12
Distinct categories3
Distinct scripts1
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique454
Unique (%)1.0%

Sample

1st row2040.0
2nd row240.0
3rd row165.0
4th row400.0
5th row1300.0
ValueCountFrequency (%)
2743.0 1183
 
2.5%
3353.0 909
 
1.9%
1829.0 812
 
1.7%
610.0 652
 
1.4%
1524.0 627
 
1.3%
914.0 612
 
1.3%
427.0 567
 
1.2%
1100.0 562
 
1.2%
200.0 531
 
1.1%
1372.0 519
 
1.1%
Other values (1798) 39688
85.1%

Most occurring characters

ValueCountFrequency (%)
0 75959
30.3%
. 46662
18.6%
1 25391
 
10.1%
2 21165
 
8.4%
3 15751
 
6.3%
5 14062
 
5.6%
4 13695
 
5.5%
7 11236
 
4.5%
9 9362
 
3.7%
6 9353
 
3.7%
Other values (2) 7937
 
3.2%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 203889
81.4%
Other Punctuation 46662
 
18.6%
Dash Punctuation 22
 
< 0.1%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 75959
37.3%
1 25391
 
12.5%
2 21165
 
10.4%
3 15751
 
7.7%
5 14062
 
6.9%
4 13695
 
6.7%
7 11236
 
5.5%
9 9362
 
4.6%
6 9353
 
4.6%
8 7915
 
3.9%
Other Punctuation
ValueCountFrequency (%)
. 46662
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 22
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 250573
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 75959
30.3%
. 46662
18.6%
1 25391
 
10.1%
2 21165
 
8.4%
3 15751
 
6.3%
5 14062
 
5.6%
4 13695
 
5.5%
7 11236
 
4.5%
9 9362
 
3.7%
6 9353
 
3.7%
Other values (2) 7937
 
3.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII 250573
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 75959
30.3%
. 46662
18.6%
1 25391
 
10.1%
2 21165
 
8.4%
3 15751
 
6.3%
5 14062
 
5.6%
4 13695
 
5.5%
7 11236
 
4.5%
9 9362
 
3.7%
6 9353
 
3.7%
Other values (2) 7937
 
3.2%
Distinct1534
Distinct (%)4.9%
Missing573266
Missing (%)94.8%
Memory size4.6 MiB

Length

Max length7
Median length6
Mean length5.472658485
Min length3

Characters and Unicode

Total characters172137
Distinct characters12
Distinct categories3
Distinct scripts1
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique401
Unique (%)1.3%

Sample

1st row2040.0
2nd row240.0
3rd row165.0
4th row400.0
5th row1300.0
ValueCountFrequency (%)
3353.0 850
 
2.7%
2438.0 719
 
2.3%
1829.0 717
 
2.3%
1524.0 582
 
1.9%
2743.0 553
 
1.8%
427.0 467
 
1.5%
1200.0 465
 
1.5%
1372.0 453
 
1.4%
2134.0 424
 
1.3%
2499.0 416
 
1.3%
Other values (1523) 25808
82.0%

Most occurring characters

ValueCountFrequency (%)
0 51748
30.1%
. 31454
18.3%
1 16786
 
9.8%
2 15345
 
8.9%
3 10880
 
6.3%
4 9719
 
5.6%
5 9555
 
5.6%
7 7998
 
4.6%
9 6255
 
3.6%
8 6241
 
3.6%
Other values (2) 6156
 
3.6%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 140671
81.7%
Other Punctuation 31454
 
18.3%
Dash Punctuation 12
 
< 0.1%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 51748
36.8%
1 16786
 
11.9%
2 15345
 
10.9%
3 10880
 
7.7%
4 9719
 
6.9%
5 9555
 
6.8%
7 7998
 
5.7%
9 6255
 
4.4%
8 6241
 
4.4%
6 6144
 
4.4%
Other Punctuation
ValueCountFrequency (%)
. 31454
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 12
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 172137
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 51748
30.1%
. 31454
18.3%
1 16786
 
9.8%
2 15345
 
8.9%
3 10880
 
6.3%
4 9719
 
5.6%
5 9555
 
5.6%
7 7998
 
4.6%
9 6255
 
3.6%
8 6241
 
3.6%
Other values (2) 6156
 
3.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII 172137
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 51748
30.1%
. 31454
18.3%
1 16786
 
9.8%
2 15345
 
8.9%
3 10880
 
6.3%
4 9719
 
5.6%
5 9555
 
5.6%
7 7998
 
4.6%
9 6255
 
3.6%
8 6241
 
3.6%
Other values (2) 6156
 
3.6%

verbatimElevation
Text

Missing 

Distinct1024
Distinct (%)10.3%
Missing594785
Missing (%)98.4%
Memory size4.6 MiB

Length

Max length94
Median length31
Mean length8.088173125
Min length1

Characters and Unicode

Total characters80356
Distinct characters54
Distinct categories9 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique334 ?
Unique (%)3.4%

Sample

1st row140 meters
2nd row3900 feet
3rd row5940 feet
4th row180 meters
5th row3000 feet
ValueCountFrequency (%)
m 2783
 
14.5%
feet 2472
 
12.9%
meters 1521
 
7.9%
ft 1465
 
7.6%
1000 347
 
1.8%
level 318
 
1.7%
sea 318
 
1.7%
300 305
 
1.6%
near 276
 
1.4%
3200 236
 
1.2%
Other values (619) 9193
47.8%

Most occurring characters

ValueCountFrequency (%)
0 16890
21.0%
e 9358
11.6%
9299
11.6%
t 5738
 
7.1%
m 5103
 
6.4%
f 4103
 
5.1%
1 4089
 
5.1%
5 3791
 
4.7%
2 2913
 
3.6%
. 2459
 
3.1%
Other values (44) 16613
20.7%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 36344
45.2%
Lowercase Letter 30894
38.4%
Space Separator 9299
 
11.6%
Other Punctuation 2946
 
3.7%
Dash Punctuation 765
 
1.0%
Uppercase Letter 44
 
0.1%
Open Punctuation 23
 
< 0.1%
Close Punctuation 23
 
< 0.1%
Math Symbol 18
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 9358
30.3%
t 5738
18.6%
m 5103
16.5%
f 4103
13.3%
r 1891
 
6.1%
s 1851
 
6.0%
a 854
 
2.8%
l 695
 
2.2%
n 346
 
1.1%
v 331
 
1.1%
Other values (12) 624
 
2.0%
Decimal Number
ValueCountFrequency (%)
0 16890
46.5%
1 4089
 
11.3%
5 3791
 
10.4%
2 2913
 
8.0%
3 2121
 
5.8%
4 1908
 
5.2%
6 1282
 
3.5%
7 1249
 
3.4%
8 1174
 
3.2%
9 927
 
2.6%
Uppercase Letter
ValueCountFrequency (%)
F 30
68.2%
N 5
 
11.4%
L 3
 
6.8%
A 2
 
4.5%
P 1
 
2.3%
B 1
 
2.3%
S 1
 
2.3%
W 1
 
2.3%
Other Punctuation
ValueCountFrequency (%)
. 2459
83.5%
' 338
 
11.5%
, 126
 
4.3%
& 13
 
0.4%
? 9
 
0.3%
/ 1
 
< 0.1%
Open Punctuation
ValueCountFrequency (%)
( 22
95.7%
[ 1
 
4.3%
Close Punctuation
ValueCountFrequency (%)
) 22
95.7%
] 1
 
4.3%
Math Symbol
ValueCountFrequency (%)
~ 17
94.4%
+ 1
 
5.6%
Space Separator
ValueCountFrequency (%)
9299
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 765
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 49418
61.5%
Latin 30938
38.5%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 9358
30.2%
t 5738
18.5%
m 5103
16.5%
f 4103
13.3%
r 1891
 
6.1%
s 1851
 
6.0%
a 854
 
2.8%
l 695
 
2.2%
n 346
 
1.1%
v 331
 
1.1%
Other values (20) 668
 
2.2%
Common
ValueCountFrequency (%)
0 16890
34.2%
9299
18.8%
1 4089
 
8.3%
5 3791
 
7.7%
2 2913
 
5.9%
. 2459
 
5.0%
3 2121
 
4.3%
4 1908
 
3.9%
6 1282
 
2.6%
7 1249
 
2.5%
Other values (14) 3417
 
6.9%

Most occurring blocks

ValueCountFrequency (%)
ASCII 80356
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 16890
21.0%
e 9358
11.6%
9299
11.6%
t 5738
 
7.1%
m 5103
 
6.4%
f 4103
 
5.1%
1 4089
 
5.1%
5 3791
 
4.7%
2 2913
 
3.6%
. 2459
 
3.1%
Other values (44) 16613
20.7%

minimumDepthInMeters
Text

Missing 

Distinct13
Distinct (%)37.1%
Missing604685
Missing (%)> 99.9%
Memory size4.6 MiB

Length

Max length16
Median length5
Mean length5.114285714
Min length3

Characters and Unicode

Total characters179
Distinct characters22
Distinct categories5 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique7 ?
Unique (%)20.0%

Sample

1st row0.0
2nd row250.0
3rd row0.0
4th rowArgia orichalcea
5th row370.0
ValueCountFrequency (%)
250.0 9
25.0%
0.0 6
16.7%
880.0 6
16.7%
370.0 3
 
8.3%
1707.0 2
 
5.6%
775.0 2
 
5.6%
argia 1
 
2.8%
orichalcea 1
 
2.8%
359.0 1
 
2.8%
1400.0 1
 
2.8%
Other values (4) 4
11.1%

Most occurring characters

ValueCountFrequency (%)
0 68
38.0%
. 34
19.0%
5 13
 
7.3%
7 13
 
7.3%
8 12
 
6.7%
2 9
 
5.0%
3 6
 
3.4%
1 4
 
2.2%
a 3
 
1.7%
4 2
 
1.1%
Other values (12) 15
 
8.4%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 129
72.1%
Other Punctuation 34
 
19.0%
Lowercase Letter 14
 
7.8%
Space Separator 1
 
0.6%
Uppercase Letter 1
 
0.6%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 68
52.7%
5 13
 
10.1%
7 13
 
10.1%
8 12
 
9.3%
2 9
 
7.0%
3 6
 
4.7%
1 4
 
3.1%
4 2
 
1.6%
9 1
 
0.8%
6 1
 
0.8%
Lowercase Letter
ValueCountFrequency (%)
a 3
21.4%
c 2
14.3%
i 2
14.3%
r 2
14.3%
g 1
 
7.1%
o 1
 
7.1%
h 1
 
7.1%
l 1
 
7.1%
e 1
 
7.1%
Other Punctuation
ValueCountFrequency (%)
. 34
100.0%
Space Separator
ValueCountFrequency (%)
1
100.0%
Uppercase Letter
ValueCountFrequency (%)
A 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 164
91.6%
Latin 15
 
8.4%

Most frequent character per script

Common
ValueCountFrequency (%)
0 68
41.5%
. 34
20.7%
5 13
 
7.9%
7 13
 
7.9%
8 12
 
7.3%
2 9
 
5.5%
3 6
 
3.7%
1 4
 
2.4%
4 2
 
1.2%
1
 
0.6%
Other values (2) 2
 
1.2%
Latin
ValueCountFrequency (%)
a 3
20.0%
c 2
13.3%
i 2
13.3%
r 2
13.3%
g 1
 
6.7%
o 1
 
6.7%
h 1
 
6.7%
l 1
 
6.7%
e 1
 
6.7%
A 1
 
6.7%

Most occurring blocks

ValueCountFrequency (%)
ASCII 179
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 68
38.0%
. 34
19.0%
5 13
 
7.3%
7 13
 
7.3%
8 12
 
6.7%
2 9
 
5.0%
3 6
 
3.4%
1 4
 
2.2%
a 3
 
1.7%
4 2
 
1.1%
Other values (12) 15
 
8.4%

maximumDepthInMeters
Text

Missing 

Distinct4
Distinct (%)36.4%
Missing604709
Missing (%)> 99.9%
Memory size4.6 MiB

Length

Max length6
Median length5
Mean length5.090909091
Min length5

Characters and Unicode

Total characters56
Distinct characters8
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique2 ?
Unique (%)18.2%

Sample

1st row220.0
2nd row220.0
3rd row370.0
4th row220.0
5th row1400.0
ValueCountFrequency (%)
220.0 6
54.5%
370.0 3
27.3%
1400.0 1
 
9.1%
500.0 1
 
9.1%

Most occurring characters

ValueCountFrequency (%)
0 24
42.9%
2 12
21.4%
. 11
19.6%
3 3
 
5.4%
7 3
 
5.4%
1 1
 
1.8%
4 1
 
1.8%
5 1
 
1.8%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 45
80.4%
Other Punctuation 11
 
19.6%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 24
53.3%
2 12
26.7%
3 3
 
6.7%
7 3
 
6.7%
1 1
 
2.2%
4 1
 
2.2%
5 1
 
2.2%
Other Punctuation
ValueCountFrequency (%)
. 11
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 56
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 24
42.9%
2 12
21.4%
. 11
19.6%
3 3
 
5.4%
7 3
 
5.4%
1 1
 
1.8%
4 1
 
1.8%
5 1
 
1.8%

Most occurring blocks

ValueCountFrequency (%)
ASCII 56
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 24
42.9%
2 12
21.4%
. 11
19.6%
3 3
 
5.4%
7 3
 
5.4%
1 1
 
1.8%
4 1
 
1.8%
5 1
 
1.8%

verbatimDepth
Text

Constant  Missing 

Distinct1
Distinct (%)16.7%
Missing604714
Missing (%)> 99.9%
Memory size4.6 MiB

Length

Max length25
Median length25
Mean length25
Min length25

Characters and Unicode

Total characters150
Distinct characters14
Distinct categories3
Distinct scripts2
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0
Unique (%)0.0%

Sample

1st row220m inside cave entrance
2nd row220m inside cave entrance
3rd row220m inside cave entrance
4th row220m inside cave entrance
5th row220m inside cave entrance
ValueCountFrequency (%)
220m 6
25.0%
inside 6
25.0%
cave 6
25.0%
entrance 6
25.0%

Most occurring characters

ValueCountFrequency (%)
e 24
16.0%
18
12.0%
n 18
12.0%
2 12
8.0%
i 12
8.0%
c 12
8.0%
a 12
8.0%
0 6
 
4.0%
m 6
 
4.0%
s 6
 
4.0%
Other values (4) 24
16.0%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 114
76.0%
Space Separator 18
 
12.0%
Decimal Number 18
 
12.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 24
21.1%
n 18
15.8%
i 12
10.5%
c 12
10.5%
a 12
10.5%
m 6
 
5.3%
s 6
 
5.3%
d 6
 
5.3%
v 6
 
5.3%
t 6
 
5.3%
Decimal Number
ValueCountFrequency (%)
2 12
66.7%
0 6
33.3%
Space Separator
ValueCountFrequency (%)
18
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 114
76.0%
Common 36
 
24.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 24
21.1%
n 18
15.8%
i 12
10.5%
c 12
10.5%
a 12
10.5%
m 6
 
5.3%
s 6
 
5.3%
d 6
 
5.3%
v 6
 
5.3%
t 6
 
5.3%
Common
ValueCountFrequency (%)
18
50.0%
2 12
33.3%
0 6
 
16.7%

Most occurring blocks

ValueCountFrequency (%)
ASCII 150
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 24
16.0%
18
12.0%
n 18
12.0%
2 12
8.0%
i 12
8.0%
c 12
8.0%
a 12
8.0%
0 6
 
4.0%
m 6
 
4.0%
s 6
 
4.0%
Other values (4) 24
16.0%

locationRemarks
Text

Constant  Missing 

Distinct1
Distinct (%)100.0%
Missing604719
Missing (%)> 99.9%
Memory size4.6 MiB

Length

Max length19
Median length19
Mean length19
Min length19

Characters and Unicode

Total characters19
Distinct characters13
Distinct categories4 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique1 ?
Unique (%)100.0%

Sample

1st rowGarrison, Rosser W.
ValueCountFrequency (%)
garrison 1
33.3%
rosser 1
33.3%
w 1
33.3%

Most occurring characters

ValueCountFrequency (%)
r 3
15.8%
s 3
15.8%
o 2
10.5%
2
10.5%
G 1
 
5.3%
a 1
 
5.3%
i 1
 
5.3%
n 1
 
5.3%
, 1
 
5.3%
R 1
 
5.3%
Other values (3) 3
15.8%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 12
63.2%
Uppercase Letter 3
 
15.8%
Space Separator 2
 
10.5%
Other Punctuation 2
 
10.5%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
r 3
25.0%
s 3
25.0%
o 2
16.7%
a 1
 
8.3%
i 1
 
8.3%
n 1
 
8.3%
e 1
 
8.3%
Uppercase Letter
ValueCountFrequency (%)
G 1
33.3%
R 1
33.3%
W 1
33.3%
Other Punctuation
ValueCountFrequency (%)
, 1
50.0%
. 1
50.0%
Space Separator
ValueCountFrequency (%)
2
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 15
78.9%
Common 4
 
21.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
r 3
20.0%
s 3
20.0%
o 2
13.3%
G 1
 
6.7%
a 1
 
6.7%
i 1
 
6.7%
n 1
 
6.7%
R 1
 
6.7%
e 1
 
6.7%
W 1
 
6.7%
Common
ValueCountFrequency (%)
2
50.0%
, 1
25.0%
. 1
25.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 19
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
r 3
15.8%
s 3
15.8%
o 2
10.5%
2
10.5%
G 1
 
5.3%
a 1
 
5.3%
i 1
 
5.3%
n 1
 
5.3%
, 1
 
5.3%
R 1
 
5.3%
Other values (3) 3
15.8%

decimalLatitude
Text

Missing 

Distinct38000
Distinct (%)11.9%
Missing285696
Missing (%)47.2%
Memory size4.6 MiB

Length

Max length65
Median length7
Mean length6.690020187
Min length3

Characters and Unicode

Total characters2134277
Distinct characters36
Distinct categories6 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique15792 ?
Unique (%)5.0%

Sample

1st row9.91378
2nd row57.18
3rd row36.5787
4th row15.5864
5th row45.4838
Value                 Count   Frequency (%)
39.6891               5053    1.6%
60.75                 3840    1.2%
60.7493               2462    0.8%
40.0925               2379    0.7%
38.02                 2014    0.6%
42.7299               1697    0.5%
37.23                 1343    0.4%
40.015                1287    0.4%
42.78                 1170    0.4%
38.9559               1141    0.4%
Other values (37318)  296643  93.0%

Most occurring characters

ValueCountFrequency (%)
. 319023
14.9%
3 273800
12.8%
4 209113
9.8%
1 188958
8.9%
2 172350
8.1%
9 169602
7.9%
7 165610
7.8%
8 159004
7.5%
5 153218
7.2%
6 152394
7.1%
Other values (26) 171205
8.0%

Most occurring categories

Value              Count    Frequency (%)
Decimal Number     1774064  83.1%
Other Punctuation  319028   14.9%
Dash Punctuation   41124    1.9%
Lowercase Letter   49       < 0.1%
Uppercase Letter   7        < 0.1%
Space Separator    5        < 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 9
18.4%
o 6
12.2%
n 5
10.2%
i 4
8.2%
e 4
8.2%
r 4
8.2%
t 4
8.2%
d 3
 
6.1%
g 2
 
4.1%
p 2
 
4.1%
Other values (6) 6
12.2%
Decimal Number
ValueCountFrequency (%)
3 273800
15.4%
4 209113
11.8%
1 188958
10.7%
2 172350
9.7%
9 169602
9.6%
7 165610
9.3%
8 159004
9.0%
5 153218
8.6%
6 152394
8.6%
0 130015
7.3%
Uppercase Letter
ValueCountFrequency (%)
A 2
28.6%
Z 1
14.3%
O 1
14.3%
I 1
14.3%
E 1
14.3%
C 1
14.3%
Other Punctuation
ValueCountFrequency (%)
. 319023
> 99.9%
, 5
 
< 0.1%
Dash Punctuation
ValueCountFrequency (%)
- 41124
100.0%
Space Separator
ValueCountFrequency (%)
5
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 2134221
> 99.9%
Latin 56
 
< 0.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 9
16.1%
o 6
10.7%
n 5
 
8.9%
i 4
 
7.1%
e 4
 
7.1%
r 4
 
7.1%
t 4
 
7.1%
d 3
 
5.4%
A 2
 
3.6%
g 2
 
3.6%
Other values (12) 13
23.2%
Common
ValueCountFrequency (%)
. 319023
14.9%
3 273800
12.8%
4 209113
9.8%
1 188958
8.9%
2 172350
8.1%
9 169602
7.9%
7 165610
7.8%
8 159004
7.5%
5 153218
7.2%
6 152394
7.1%
Other values (4) 171149
8.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 2134277
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
. 319023
14.9%
3 273800
12.8%
4 209113
9.8%
1 188958
8.9%
2 172350
8.1%
9 169602
7.9%
7 165610
7.8%
8 159004
7.5%
5 153218
7.2%
6 152394
7.1%
Other values (26) 171205
8.0%
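
The statistics above show that decimalLatitude is stored as text and contains a small number of letters, commas and spaces, so some values are not plain decimal numbers. A minimal sketch of one way to separate usable latitudes from stray values follows; the function name `parse_latitude` and the stray example strings are hypothetical and not taken from the dataset.

```python
import math


def parse_latitude(text):
    """Return a float latitude, or None if the value is missing,
    non-numeric, or outside the valid range of -90 to 90 degrees."""
    if text is None:
        return None
    try:
        value = float(text.strip())
    except ValueError:
        return None  # e.g. a taxon name shifted into this column
    if math.isnan(value) or not -90.0 <= value <= 90.0:
        return None
    return value


# The first two strings come from the sample rows above; the rest are
# hypothetical examples of values that should be rejected.
raw = ["9.91378", "57.18", "Animalia, Arthropoda", None, "123.4"]
parsed = [parse_latitude(v) for v in raw]
print(parsed)
```

Rejected values are returned as `None` rather than dropped, so the affected rows can still be inspected and traced back to their source records.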

decimalLongitude
Text

Missing 

Distinct36959
Distinct (%)11.6%
Missing285696
Missing (%)47.2%
Memory size4.6 MiB

Length

Max length9
Median length8
Mean length7.477506395
Min length3

Characters and Unicode

Total characters2385504
Distinct characters18
Distinct categories5 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique15086 ?
Unique (%)4.7%

Sample

1st row-83.6744
2nd row-170.27
3rd row-75.8881
4th row-61.4739
5th row-75.9727
Value                 Count   Frequency (%)
105.644               5103    1.6%
139.5                 3838    1.2%
139.504               2462    0.8%
105.358               2379    0.7%
87.8123               1697    0.5%
119.93                1404    0.4%
105.27                1361    0.4%
80.4178               1322    0.4%
0.365                 1301    0.4%
87.76                 1163    0.4%
Other values (36449)  296994  93.1%

Most occurring characters

ValueCountFrequency (%)
. 319023
13.4%
1 292933
12.3%
- 270766
11.4%
7 217532
9.1%
8 193895
8.1%
6 165418
6.9%
5 162723
6.8%
3 158480
6.6%
2 156818
6.6%
9 154493
6.5%
Other values (8) 293423
12.3%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 1795707
75.3%
Other Punctuation 319023
 
13.4%
Dash Punctuation 270766
 
11.4%
Lowercase Letter 7
 
< 0.1%
Uppercase Letter 1
 
< 0.1%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 292933
16.3%
7 217532
12.1%
8 193895
10.8%
6 165418
9.2%
5 162723
9.1%
3 158480
8.8%
2 156818
8.7%
9 154493
8.6%
4 148399
8.3%
0 145016
8.1%
Lowercase Letter
ValueCountFrequency (%)
i 2
28.6%
a 2
28.6%
n 1
14.3%
m 1
14.3%
l 1
14.3%
Other Punctuation
ValueCountFrequency (%)
. 319023
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 270766
100.0%
Uppercase Letter
ValueCountFrequency (%)
A 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 2385496
> 99.9%
Latin 8
 
< 0.1%

Most frequent character per script

Common
ValueCountFrequency (%)
. 319023
13.4%
1 292933
12.3%
- 270766
11.4%
7 217532
9.1%
8 193895
8.1%
6 165418
6.9%
5 162723
6.8%
3 158480
6.6%
2 156818
6.6%
9 154493
6.5%
Other values (2) 293415
12.3%
Latin
ValueCountFrequency (%)
i 2
25.0%
a 2
25.0%
A 1
12.5%
n 1
12.5%
m 1
12.5%
l 1
12.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII 2385504
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
. 319023
13.4%
1 292933
12.3%
- 270766
11.4%
7 217532
9.1%
8 193895
8.1%
6 165418
6.9%
5 162723
6.8%
3 158480
6.6%
2 156818
6.6%
9 154493
6.5%
Other values (8) 293423
12.3%

geodeticDatum
Text

Missing 

Distinct7
Distinct (%)< 0.1%
Missing578337
Missing (%)95.6%
Memory size4.6 MiB

Length

Max length18
Median length18
Mean length17.50456734
Min length5

Characters and Unicode

Total characters461823
Distinct characters30
Distinct categories8 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique2 ?
Unique (%)< 0.1%

Sample

1st rowWGS 84 (EPSG:4326)
2nd rowWGS 84 (EPSG:4326)
3rd rowWGS 84 (EPSG:4326)
4th rowWGS 84 (EPSG:4326)
5th rowWGS 84 (EPSG:4326)
Value       Count  Frequency (%)
wgs         25014  32.6%
84          25014  32.6%
epsg:4326   25008  32.6%
wgs84       754    1.0%
nad83       399    0.5%
epsg:4269   399    0.5%
wgs40       214    0.3%
arthropoda  1      < 0.1%
1973-05-08  1      < 0.1%

Most occurring characters

ValueCountFrequency (%)
G 51389
11.1%
S 51389
11.1%
4 51389
11.1%
50421
10.9%
8 26168
 
5.7%
W 25982
 
5.6%
3 25408
 
5.5%
( 25407
 
5.5%
E 25407
 
5.5%
P 25407
 
5.5%
Other values (20) 103456
22.4%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 180772
39.1%
Decimal Number 154398
33.4%
Space Separator 50421
 
10.9%
Open Punctuation 25407
 
5.5%
Other Punctuation 25407
 
5.5%
Close Punctuation 25407
 
5.5%
Lowercase Letter 9
 
< 0.1%
Dash Punctuation 2
 
< 0.1%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
4 51389
33.3%
8 26168
16.9%
3 25408
16.5%
2 25407
16.5%
6 25407
16.5%
9 400
 
0.3%
0 216
 
0.1%
1 1
 
< 0.1%
7 1
 
< 0.1%
5 1
 
< 0.1%
Uppercase Letter
ValueCountFrequency (%)
G 51389
28.4%
S 51389
28.4%
W 25982
14.4%
E 25407
14.1%
P 25407
14.1%
A 400
 
0.2%
D 399
 
0.2%
N 399
 
0.2%
Lowercase Letter
ValueCountFrequency (%)
r 2
22.2%
o 2
22.2%
t 1
11.1%
h 1
11.1%
p 1
11.1%
d 1
11.1%
a 1
11.1%
Space Separator
ValueCountFrequency (%)
50421
100.0%
Open Punctuation
ValueCountFrequency (%)
( 25407
100.0%
Other Punctuation
ValueCountFrequency (%)
: 25407
100.0%
Close Punctuation
ValueCountFrequency (%)
) 25407
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 2
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 281042
60.9%
Latin 180781
39.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
G 51389
28.4%
S 51389
28.4%
W 25982
14.4%
E 25407
14.1%
P 25407
14.1%
A 400
 
0.2%
D 399
 
0.2%
N 399
 
0.2%
r 2
 
< 0.1%
o 2
 
< 0.1%
Other values (5) 5
 
< 0.1%
Common
ValueCountFrequency (%)
4 51389
18.3%
50421
17.9%
8 26168
9.3%
3 25408
9.0%
( 25407
9.0%
: 25407
9.0%
2 25407
9.0%
6 25407
9.0%
) 25407
9.0%
9 400
 
0.1%
Other values (5) 221
 
0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 461823
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
G 51389
11.1%
S 51389
11.1%
4 51389
11.1%
50421
10.9%
8 26168
 
5.7%
W 25982
 
5.6%
3 25408
 
5.5%
( 25407
 
5.5%
E 25407
 
5.5%
P 25407
 
5.5%
Other values (20) 103456
22.4%
Distinct1494
Distinct (%)12.5%
Missing592766
Missing (%)98.0%
Memory size4.6 MiB

Length

Max length7
Median length4
Mean length4.138698344
Min length2

Characters and Unicode

Total characters49474
Distinct characters17
Distinct categories3 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique746 ?
Unique (%)6.2%

Sample

1st row931
2nd row10206
3rd row6642
4th row3036
5th row301
ValueCountFrequency (%)
3036 1744
 
14.6%
301 466
 
3.9%
34239 426
 
3.6%
1189 258
 
2.2%
20000 247
 
2.1%
3048 220
 
1.8%
15000 199
 
1.7%
52150 194
 
1.6%
14563 162
 
1.4%
9346 135
 
1.1%
Other values (1484) 7903
66.1%

Most occurring characters

ValueCountFrequency (%)
0 9238
18.7%
3 8252
16.7%
1 6353
12.8%
2 4894
9.9%
6 4647
9.4%
4 3910
7.9%
5 3501
 
7.1%
9 3065
 
6.2%
8 2862
 
5.8%
7 2745
 
5.5%
Other values (7) 7
 
< 0.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 49467
> 99.9%
Lowercase Letter 6
 
< 0.1%
Uppercase Letter 1
 
< 0.1%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 9238
18.7%
3 8252
16.7%
1 6353
12.8%
2 4894
9.9%
6 4647
9.4%
4 3910
7.9%
5 3501
 
7.1%
9 3065
 
6.2%
8 2862
 
5.8%
7 2745
 
5.5%
Lowercase Letter
ValueCountFrequency (%)
n 1
16.7%
s 1
16.7%
e 1
16.7%
c 1
16.7%
t 1
16.7%
a 1
16.7%
Uppercase Letter
ValueCountFrequency (%)
I 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 49467
> 99.9%
Latin 7
 
< 0.1%

Most frequent character per script

Common
ValueCountFrequency (%)
0 9238
18.7%
3 8252
16.7%
1 6353
12.8%
2 4894
9.9%
6 4647
9.4%
4 3910
7.9%
5 3501
 
7.1%
9 3065
 
6.2%
8 2862
 
5.8%
7 2745
 
5.5%
Latin
ValueCountFrequency (%)
I 1
14.3%
n 1
14.3%
s 1
14.3%
e 1
14.3%
c 1
14.3%
t 1
14.3%
a 1
14.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 49474
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 9238
18.7%
3 8252
16.7%
1 6353
12.8%
2 4894
9.9%
6 4647
9.4%
4 3910
7.9%
5 3501
 
7.1%
9 3065
 
6.2%
8 2862
 
5.8%
7 2745
 
5.5%
Other values (7) 7
 
< 0.1%

coordinatePrecision
Text

Missing 

Distinct3
Distinct (%)100.0%
Missing604717
Missing (%)> 99.9%
Memory size4.6 MiB

Length

Max length7
Median length3
Mean length4
Min length2

Characters and Unicode

Total characters12
Distinct characters11
Distinct categories3 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique3 ?
Unique (%)100.0%

Sample

1st rowOdonata
2nd row69
3rd row128
ValueCountFrequency (%)
odonata 1
33.3%
69 1
33.3%
128 1
33.3%

Most occurring characters

ValueCountFrequency (%)
a 2
16.7%
O 1
8.3%
d 1
8.3%
o 1
8.3%
n 1
8.3%
t 1
8.3%
6 1
8.3%
9 1
8.3%
1 1
8.3%
2 1
8.3%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 6
50.0%
Decimal Number 5
41.7%
Uppercase Letter 1
 
8.3%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 2
33.3%
d 1
16.7%
o 1
16.7%
n 1
16.7%
t 1
16.7%
Decimal Number
ValueCountFrequency (%)
6 1
20.0%
9 1
20.0%
1 1
20.0%
2 1
20.0%
8 1
20.0%
Uppercase Letter
ValueCountFrequency (%)
O 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 7
58.3%
Common 5
41.7%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 2
28.6%
O 1
14.3%
d 1
14.3%
o 1
14.3%
n 1
14.3%
t 1
14.3%
Common
ValueCountFrequency (%)
6 1
20.0%
9 1
20.0%
1 1
20.0%
2 1
20.0%
8 1
20.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 12
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 2
16.7%
O 1
8.3%
d 1
8.3%
o 1
8.3%
n 1
8.3%
t 1
8.3%
6 1
8.3%
9 1
8.3%
1 1
8.3%
2 1
8.3%

pointRadiusSpatialFit
Text

Missing 

Distinct2
Distinct (%)100.0%
Missing604718
Missing (%)> 99.9%
Memory size4.6 MiB

Length

Max length3
Median length2.5
Mean length2.5
Min length2

Characters and Unicode

Total characters5
Distinct characters5
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique2 ?
Unique (%)100.0%

Sample

1st row69
2nd row128
ValueCountFrequency (%)
69 1
50.0%
128 1
50.0%

Most occurring characters

ValueCountFrequency (%)
6 1
20.0%
9 1
20.0%
1 1
20.0%
2 1
20.0%
8 1
20.0%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 5
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
6 1
20.0%
9 1
20.0%
1 1
20.0%
2 1
20.0%
8 1
20.0%

Most occurring scripts

ValueCountFrequency (%)
Common 5
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
6 1
20.0%
9 1
20.0%
1 1
20.0%
2 1
20.0%
8 1
20.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 5
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
6 1
20.0%
9 1
20.0%
1 1
20.0%
2 1
20.0%
8 1
20.0%

verbatimCoordinates
Text

Missing 

Distinct2
Distinct (%)100.0%
Missing604718
Missing (%)> 99.9%
Memory size4.6 MiB

Length

Max length14
Median length9
Mean length9
Min length4

Characters and Unicode

Total characters18
Distinct characters13
Distinct categories3 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique2 ?
Unique (%)100.0%

Sample

1st rowCoenagrionidae
2nd row1973
ValueCountFrequency (%)
coenagrionidae 1
50.0%
1973 1
50.0%

Most occurring characters

ValueCountFrequency (%)
o 2
11.1%
e 2
11.1%
n 2
11.1%
a 2
11.1%
i 2
11.1%
C 1
 
5.6%
g 1
 
5.6%
r 1
 
5.6%
d 1
 
5.6%
1 1
 
5.6%
Other values (3) 3
16.7%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 13
72.2%
Decimal Number 4
 
22.2%
Uppercase Letter 1
 
5.6%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
o 2
15.4%
e 2
15.4%
n 2
15.4%
a 2
15.4%
i 2
15.4%
g 1
7.7%
r 1
7.7%
d 1
7.7%
Decimal Number
ValueCountFrequency (%)
1 1
25.0%
9 1
25.0%
7 1
25.0%
3 1
25.0%
Uppercase Letter
ValueCountFrequency (%)
C 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 14
77.8%
Common 4
 
22.2%

Most frequent character per script

Latin
ValueCountFrequency (%)
o 2
14.3%
e 2
14.3%
n 2
14.3%
a 2
14.3%
i 2
14.3%
C 1
7.1%
g 1
7.1%
r 1
7.1%
d 1
7.1%
Common
ValueCountFrequency (%)
1 1
25.0%
9 1
25.0%
7 1
25.0%
3 1
25.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 18
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
o 2
11.1%
e 2
11.1%
n 2
11.1%
a 2
11.1%
i 2
11.1%
C 1
 
5.6%
g 1
 
5.6%
r 1
 
5.6%
d 1
 
5.6%
1 1
 
5.6%
Other values (3) 3
16.7%

verbatimLatitude
Text

Missing 

Distinct10290
Distinct (%)12.6%
Missing523062
Missing (%)86.5%
Memory size4.6 MiB

Length

Max length31
Median length9
Mean length8.943949154
Min length1

Characters and Unicode

Total characters730345
Distinct characters54
Distinct categories14 ?
Distinct scripts3 ?
Distinct blocks5 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique3874 ?
Unique (%)4.7%

Sample

1st rowN36.578717
2nd row0 deg 50' 00" N
3rd row3 deg. 21.1' N
4th row10 32' S
5th row39.079276
Value                Count  Frequency (%)
n                    12202  10.3%
deg                  3779   3.2%
s                    3061   2.6%
40.014986            1227   1.0%
38.955944            1139   1.0%
39                   889    0.7%
10                   854    0.7%
12                   805    0.7%
40.001652            790    0.7%
38                   783    0.7%
Other values (9199)  93254  78.5%

Most occurring characters

ValueCountFrequency (%)
3 79859
10.9%
. 76634
10.5%
4 74529
10.2%
1 56463
 
7.7%
2 53355
 
7.3%
8 51998
 
7.1%
0 49153
 
6.7%
9 48559
 
6.6%
5 48310
 
6.6%
6 41823
 
5.7%
Other values (44) 149662
20.5%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 544502
74.6%
Other Punctuation 90638
 
12.4%
Space Separator 37125
 
5.1%
Uppercase Letter 29498
 
4.0%
Lowercase Letter 18739
 
2.6%
Other Symbol 5504
 
0.8%
Dash Punctuation 4170
 
0.6%
Open Punctuation 49
 
< 0.1%
Close Punctuation 49
 
< 0.1%
Other Letter 32
 
< 0.1%
Other values (4) 39
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
d 6147
32.8%
g 5893
31.4%
e 5794
30.9%
r 659
 
3.5%
s 198
 
1.1%
t 11
 
0.1%
n 10
 
0.1%
o 10
 
0.1%
h 5
 
< 0.1%
l 3
 
< 0.1%
Other values (5) 9
 
< 0.1%
Decimal Number
ValueCountFrequency (%)
3 79859
14.7%
4 74529
13.7%
1 56463
10.4%
2 53355
9.8%
8 51998
9.5%
0 49153
9.0%
9 48559
8.9%
5 48310
8.9%
6 41823
7.7%
7 40453
7.4%
Other Punctuation
ValueCountFrequency (%)
. 76634
84.5%
' 13062
 
14.4%
" 813
 
0.9%
: 79
 
0.1%
16
 
< 0.1%
15
 
< 0.1%
& 13
 
< 0.1%
, 4
 
< 0.1%
\ 2
 
< 0.1%
Uppercase Letter
ValueCountFrequency (%)
N 22689
76.9%
S 6676
 
22.6%
W 81
 
0.3%
E 30
 
0.1%
B 16
 
0.1%
D 5
 
< 0.1%
M 1
 
< 0.1%
Open Punctuation
ValueCountFrequency (%)
[ 48
98.0%
( 1
 
2.0%
Close Punctuation
ValueCountFrequency (%)
] 48
98.0%
) 1
 
2.0%
Modifier Symbol
ValueCountFrequency (%)
˚ 16
64.0%
´ 9
36.0%
Space Separator
ValueCountFrequency (%)
37125
100.0%
Other Symbol
ValueCountFrequency (%)
° 5504
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 4170
100.0%
Other Letter
ValueCountFrequency (%)
º 32
100.0%
Final Punctuation
ValueCountFrequency (%)
8
100.0%
Math Symbol
ValueCountFrequency (%)
~ 4
100.0%
Nonspacing Mark
ValueCountFrequency (%)
̊ 2
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 682074
93.4%
Latin 48269
 
6.6%
Inherited 2
 
< 0.1%

Most frequent character per script

Common
ValueCountFrequency (%)
3 79859
11.7%
. 76634
11.2%
4 74529
10.9%
1 56463
8.3%
2 53355
7.8%
8 51998
7.6%
0 49153
7.2%
9 48559
7.1%
5 48310
7.1%
6 41823
6.1%
Other values (20) 101391
14.9%
Latin
ValueCountFrequency (%)
N 22689
47.0%
S 6676
 
13.8%
d 6147
 
12.7%
g 5893
 
12.2%
e 5794
 
12.0%
r 659
 
1.4%
s 198
 
0.4%
W 81
 
0.2%
º 32
 
0.1%
E 30
 
0.1%
Other values (13) 70
 
0.1%
Inherited
ValueCountFrequency (%)
̊ 2
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 724743
99.2%
None 5545
 
0.8%
Punctuation 39
 
< 0.1%
Modifier Letters 16
 
< 0.1%
Diacriticals 2
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
3 79859
11.0%
. 76634
10.6%
4 74529
10.3%
1 56463
 
7.8%
2 53355
 
7.4%
8 51998
 
7.2%
0 49153
 
6.8%
9 48559
 
6.7%
5 48310
 
6.7%
6 41823
 
5.8%
Other values (36) 144060
19.9%
None
ValueCountFrequency (%)
° 5504
99.3%
º 32
 
0.6%
´ 9
 
0.2%
Punctuation
ValueCountFrequency (%)
16
41.0%
15
38.5%
8
20.5%
Modifier Letters
ValueCountFrequency (%)
˚ 16
100.0%
Diacriticals
ValueCountFrequency (%)
̊ 2
100.0%
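
The sample rows for verbatimLatitude mix several notations: hemisphere-prefixed decimals, degrees and minutes with a "deg" marker, and bare decimals. The sketch below shows one possible way to convert such strings to signed decimal degrees. The function name `parse_verbatim_coordinate` and the pattern are assumptions made for illustration; they cover only the notations visible in the samples and do not check that the result lies in a valid range.

```python
import re

_NUM = r"\d+(?:\.\d+)?"
_PATTERN = re.compile(
    rf"""^\s*(?P<pre>[NSEW])?\s*                  # optional leading hemisphere
         (?P<sign>-)?(?P<deg>{_NUM})\s*(?:deg\.?|°)?\s*   # degrees, optional marker
         (?:(?P<min>{_NUM})\s*'\s*)?              # optional minutes
         (?:(?P<sec>{_NUM})\s*"\s*)?              # optional seconds
         (?P<post>[NSEW])?\s*$                    # optional trailing hemisphere
     """,
    re.VERBOSE | re.IGNORECASE,
)


def parse_verbatim_coordinate(text):
    """Return signed decimal degrees, or None if the text is not recognised."""
    m = _PATTERN.match(text)
    if m is None:
        return None
    value = (
        float(m["deg"])
        + float(m["min"] or 0) / 60
        + float(m["sec"] or 0) / 3600
    )
    hemisphere = (m["pre"] or m["post"] or "").upper()
    # South and West, or an explicit minus sign, give negative values.
    if m["sign"] or hemisphere in ("S", "W"):
        value = -value
    return value


# Strings copied from the sample rows of this variable.
samples = ["N36.578717", "0 deg 50' 00\" N", "3 deg. 21.1' N", "10 32' S", "39.079276"]
converted = [parse_verbatim_coordinate(s) for s in samples]
print(converted)
```

Strings the pattern does not recognise return `None`, which keeps unparsed rows visible for manual review instead of silently guessing a value.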

verbatimLongitude
Text

Missing 

Distinct10183
Distinct (%)12.5%
Missing523032
Missing (%)86.5%
Memory size4.6 MiB

Length

Max length31
Median length28
Mean length9.817243659
Min length1

Characters and Unicode

Total characters801951
Distinct characters54
Distinct categories14 ?
Distinct scripts3 ?
Distinct blocks5 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique3804 ?
Unique (%)4.7%

Sample

1st rowW75.88805
2nd row66 deg 09' 44" W
3rd row59 deg. 40.5' W
4th row62 48' W
5th row-76.59802
Value                Count  Frequency (%)
w                    13038  11.0%
deg                  3758   3.2%
e                    2358   2.0%
105.270546           1260   1.1%
76.94553             1139   1.0%
76                   1012   0.9%
59                   834    0.7%
105.307491           790    0.7%
70                   782    0.7%
77.254426            778    0.7%
Other values (9264)  92725  78.3%

Most occurring characters

ValueCountFrequency (%)
7 78854
 
9.8%
. 76662
 
9.6%
1 65451
 
8.2%
8 61925
 
7.7%
0 59445
 
7.4%
5 56396
 
7.0%
6 55542
 
6.9%
- 52771
 
6.6%
2 48852
 
6.1%
3 48668
 
6.1%
Other values (44) 197385
24.6%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 568041
70.8%
Other Punctuation 90510
 
11.3%
Dash Punctuation 52771
 
6.6%
Space Separator 36786
 
4.6%
Uppercase Letter 29452
 
3.7%
Lowercase Letter 18732
 
2.3%
Other Symbol 5488
 
0.7%
Close Punctuation 51
 
< 0.1%
Open Punctuation 49
 
< 0.1%
Other Letter 32
 
< 0.1%
Other values (4) 39
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
d 6143
32.8%
g 5886
31.4%
e 5802
31.0%
r 655
 
3.5%
s 196
 
1.0%
w 23
 
0.1%
t 9
 
< 0.1%
o 6
 
< 0.1%
n 4
 
< 0.1%
l 3
 
< 0.1%
Other values (3) 5
 
< 0.1%
Decimal Number
ValueCountFrequency (%)
7 78854
13.9%
1 65451
11.5%
8 61925
10.9%
0 59445
10.5%
5 56396
9.9%
6 55542
9.8%
2 48852
8.6%
3 48668
8.6%
4 47518
8.4%
9 45390
8.0%
Other Punctuation
ValueCountFrequency (%)
. 76662
84.7%
' 12870
 
14.2%
" 843
 
0.9%
: 79
 
0.1%
16
 
< 0.1%
15
 
< 0.1%
& 13
 
< 0.1%
, 8
 
< 0.1%
; 3
 
< 0.1%
? 1
 
< 0.1%
Uppercase Letter
ValueCountFrequency (%)
W 24235
82.3%
E 5087
 
17.3%
N 54
 
0.2%
S 53
 
0.2%
L 16
 
0.1%
O 5
 
< 0.1%
D 1
 
< 0.1%
M 1
 
< 0.1%
Close Punctuation
ValueCountFrequency (%)
] 49
96.1%
) 2
 
3.9%
Open Punctuation
ValueCountFrequency (%)
[ 47
95.9%
( 2
 
4.1%
Modifier Symbol
ValueCountFrequency (%)
˚ 16
64.0%
´ 9
36.0%
Dash Punctuation
ValueCountFrequency (%)
- 52771
100.0%
Space Separator
ValueCountFrequency (%)
36786
100.0%
Other Symbol
ValueCountFrequency (%)
° 5488
100.0%
Other Letter
ValueCountFrequency (%)
º 32
100.0%
Final Punctuation
ValueCountFrequency (%)
8
100.0%
Math Symbol
ValueCountFrequency (%)
~ 4
100.0%
Nonspacing Mark
ValueCountFrequency (%)
̊ 2
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 753733
94.0%
Latin 48216
 
6.0%
Inherited 2
 
< 0.1%

Most frequent character per script

Common
ValueCountFrequency (%)
7 78854
10.5%
. 76662
10.2%
1 65451
8.7%
8 61925
8.2%
0 59445
 
7.9%
5 56396
 
7.5%
6 55542
 
7.4%
- 52771
 
7.0%
2 48852
 
6.5%
3 48668
 
6.5%
Other values (21) 149167
19.8%
Latin
ValueCountFrequency (%)
W 24235
50.3%
d 6143
 
12.7%
g 5886
 
12.2%
e 5802
 
12.0%
E 5087
 
10.6%
r 655
 
1.4%
s 196
 
0.4%
N 54
 
0.1%
S 53
 
0.1%
º 32
 
0.1%
Other values (12) 73
 
0.2%
Inherited
ValueCountFrequency (%)
̊ 2
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 796365
99.3%
None 5529
 
0.7%
Punctuation 39
 
< 0.1%
Modifier Letters 16
 
< 0.1%
Diacriticals 2
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
7 78854
9.9%
. 76662
 
9.6%
1 65451
 
8.2%
8 61925
 
7.8%
0 59445
 
7.5%
5 56396
 
7.1%
6 55542
 
7.0%
- 52771
 
6.6%
2 48852
 
6.1%
3 48668
 
6.1%
Other values (36) 191799
24.1%
None
ValueCountFrequency (%)
° 5488
99.3%
º 32
 
0.6%
´ 9
 
0.2%
Punctuation
ValueCountFrequency (%)
16
41.0%
15
38.5%
8
20.5%
Modifier Letters
ValueCountFrequency (%)
˚ 16
100.0%
Diacriticals
ValueCountFrequency (%)
̊ 2
100.0%
Distinct3
Distinct (%)100.0%
Missing604717
Missing (%)> 99.9%
Memory size4.6 MiB

Length

Max length23
Median length8
Mean length12.66666667
Min length7

Characters and Unicode

Total characters38
Distinct characters24
Distinct categories5 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique3 ?
Unique (%)100.0%

Sample

1st rowDegrees Minutes Seconds
2nd row9 March
3rd row8.v.1973
ValueCountFrequency (%)
degrees 1
16.7%
minutes 1
16.7%
seconds 1
16.7%
9 1
16.7%
march 1
16.7%
8.v.1973 1
16.7%

Most occurring characters

ValueCountFrequency (%)
e 5
 
13.2%
s 3
 
7.9%
3
 
7.9%
c 2
 
5.3%
9 2
 
5.3%
r 2
 
5.3%
M 2
 
5.3%
n 2
 
5.3%
. 2
 
5.3%
7 1
 
2.6%
Other values (14) 14
36.8%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 23
60.5%
Decimal Number 6
 
15.8%
Uppercase Letter 4
 
10.5%
Space Separator 3
 
7.9%
Other Punctuation 2
 
5.3%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 5
21.7%
s 3
13.0%
c 2
 
8.7%
r 2
 
8.7%
n 2
 
8.7%
v 1
 
4.3%
h 1
 
4.3%
a 1
 
4.3%
d 1
 
4.3%
o 1
 
4.3%
Other values (4) 4
17.4%
Decimal Number
ValueCountFrequency (%)
9 2
33.3%
7 1
16.7%
1 1
16.7%
8 1
16.7%
3 1
16.7%
Uppercase Letter
ValueCountFrequency (%)
M 2
50.0%
D 1
25.0%
S 1
25.0%
Space Separator
ValueCountFrequency (%)
3
100.0%
Other Punctuation
ValueCountFrequency (%)
. 2
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 27
71.1%
Common 11
28.9%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 5
18.5%
s 3
11.1%
c 2
 
7.4%
r 2
 
7.4%
M 2
 
7.4%
n 2
 
7.4%
v 1
 
3.7%
h 1
 
3.7%
a 1
 
3.7%
D 1
 
3.7%
Other values (7) 7
25.9%
Common
ValueCountFrequency (%)
3
27.3%
9 2
18.2%
. 2
18.2%
7 1
 
9.1%
1 1
 
9.1%
8 1
 
9.1%
3 1
 
9.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 38
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 5
 
13.2%
s 3
 
7.9%
3
 
7.9%
c 2
 
5.3%
9 2
 
5.3%
r 2
 
5.3%
M 2
 
5.3%
n 2
 
5.3%
. 2
 
5.3%
7 1
 
2.6%
Other values (14) 14
36.8%

verbatimSRS
Text

Constant  Missing 

Distinct1
Distinct (%)100.0%
Missing604719
Missing (%)> 99.9%
Memory size4.6 MiB

Length

Max length5
Median length5
Mean length5
Min length5

Characters and Unicode

Total characters5
Distinct characters5
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique1 ?
Unique (%)100.0%

Sample

1st rowArgia
ValueCountFrequency (%)
argia 1
100.0%

Most occurring characters

ValueCountFrequency (%)
A 1
20.0%
r 1
20.0%
g 1
20.0%
i 1
20.0%
a 1
20.0%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 4
80.0%
Uppercase Letter 1
 
20.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
r 1
25.0%
g 1
25.0%
i 1
25.0%
a 1
25.0%
Uppercase Letter
ValueCountFrequency (%)
A 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 5
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
A 1
20.0%
r 1
20.0%
g 1
20.0%
i 1
20.0%
a 1
20.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 5
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
A 1
20.0%
r 1
20.0%
g 1
20.0%
i 1
20.0%
a 1
20.0%

footprintSpatialFit
Text

Constant  Missing 

Distinct1
Distinct (%)100.0%
Missing604719
Missing (%)> 99.9%
Memory size4.6 MiB

Length

Max length22
Median length22
Mean length22
Min length22

Characters and Unicode

Total characters22
Distinct characters15
Distinct categories3 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique1 ?
Unique (%)100.0%

Sample

1st rowGynacantha membranalis
ValueCountFrequency (%)
gynacantha 1
50.0%
membranalis 1
50.0%

Most occurring characters

ValueCountFrequency (%)
a 5
22.7%
n 3
13.6%
m 2
 
9.1%
G 1
 
4.5%
y 1
 
4.5%
c 1
 
4.5%
t 1
 
4.5%
h 1
 
4.5%
1
 
4.5%
e 1
 
4.5%
Other values (5) 5
22.7%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 20
90.9%
Uppercase Letter 1
 
4.5%
Space Separator 1
 
4.5%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 5
25.0%
n 3
15.0%
m 2
 
10.0%
y 1
 
5.0%
c 1
 
5.0%
t 1
 
5.0%
h 1
 
5.0%
e 1
 
5.0%
b 1
 
5.0%
r 1
 
5.0%
Other values (3) 3
15.0%
Uppercase Letter
ValueCountFrequency (%)
G 1
100.0%
Space Separator
ValueCountFrequency (%)
1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 21
95.5%
Common 1
 
4.5%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 5
23.8%
n 3
14.3%
m 2
 
9.5%
G 1
 
4.8%
y 1
 
4.8%
c 1
 
4.8%
t 1
 
4.8%
h 1
 
4.8%
e 1
 
4.8%
b 1
 
4.8%
Other values (4) 4
19.0%
Common
ValueCountFrequency (%)
1
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 22
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 5
22.7%
n 3
13.6%
m 2
 
9.1%
G 1
 
4.5%
y 1
 
4.5%
c 1
 
4.5%
t 1
 
4.5%
h 1
 
4.5%
1
 
4.5%
e 1
 
4.5%
Other values (5) 5
22.7%

georeferencedBy
Text

Constant  Missing 

Distinct: 1
Distinct (%): 100.0%
Missing: 604719
Missing (%): > 99.9%
Memory size: 4.6 MiB

Length

Max length: 10
Median length: 10
Mean length: 10
Min length: 10

Characters and Unicode

Total characters: 10
Distinct characters: 8
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1
Unique (%): 100.0%

Sample

1st row: orichalcea
ValueCountFrequency (%)
orichalcea 1
100.0%

Most occurring characters

ValueCountFrequency (%)
c 2
20.0%
a 2
20.0%
o 1
10.0%
r 1
10.0%
i 1
10.0%
h 1
10.0%
l 1
10.0%
e 1
10.0%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 10
100.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
c 2
20.0%
a 2
20.0%
o 1
10.0%
r 1
10.0%
i 1
10.0%
h 1
10.0%
l 1
10.0%
e 1
10.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 10
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
c 2
20.0%
a 2
20.0%
o 1
10.0%
r 1
10.0%
i 1
10.0%
h 1
10.0%
l 1
10.0%
e 1
10.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 10
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
c 2
20.0%
a 2
20.0%
o 1
10.0%
r 1
10.0%
i 1
10.0%
h 1
10.0%
l 1
10.0%
e 1
10.0%

georeferenceProtocol
Text

Missing 

Distinct: 64
Distinct (%): < 0.1%
Missing: 366819
Missing (%): 60.7%
Memory size: 4.6 MiB

Length

Max length: 72
Median length: 12
Mean length: 10.94749497
Min length: 3

Characters and Unicode

Total characters: 2604420
Distinct characters: 61
Distinct categories: 6
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 21
Unique (%): < 0.1%

Sample

1st row: Google Maps
2nd row: Google Earth
3rd row: Google Earth
4th row: GEOLocate
5th row: Google Earth
ValueCountFrequency (%)
google 163403
40.4%
earth 120779
29.8%
geolocate 70758
17.5%
maps 42650
 
10.5%
gps 1516
 
0.4%
coordinates 782
 
0.2%
centroid 781
 
0.2%
geonames 719
 
0.2%
from 711
 
0.2%
country 671
 
0.2%
Other values (105) 2061
 
0.5%
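
The word table above can be approximated with a few lines of Python. This is a minimal sketch under stated assumptions, not the profiler's actual code: it assumes each value is split on whitespace, that leading and trailing punctuation is stripped, and that tokens are lowercased. The function name `word_frequencies` is hypothetical, and the input is just the five sample rows shown for this variable, so the counts are far smaller than those in the full table.

```python
import string
from collections import Counter


def word_frequencies(values):
    """Return {word: (count, percent)} over all whitespace-separated tokens."""
    tokens = []
    for value in values:
        for raw in value.split():
            # Assumption: punctuation at token edges is dropped, case is folded.
            token = raw.strip(string.punctuation).lower()
            if token:
                tokens.append(token)
    counts = Counter(tokens)
    total = sum(counts.values())
    return {word: (n, 100.0 * n / total) for word, n in counts.items()}


# The five sample rows listed for georeferenceProtocol.
sample_rows = ["Google Maps", "Google Earth", "Google Earth", "GEOLocate", "Google Earth"]
freq = word_frequencies(sample_rows)

for word, (count, percent) in sorted(freq.items(), key=lambda kv: -kv[1][0]):
    print(f"{word} {count} {percent:.1f}%")
```

Percentages are taken over the total number of tokens rather than the number of rows, which is why the frequencies in the word table sum to 100% even though most values contain two words.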

Most occurring characters

ValueCountFrequency (%)
o 402623
15.5%
e 238641
9.2%
a 237508
9.1%
G 236572
9.1%
t 194824
7.5%
E 191441
7.4%
l 169506
 
6.5%
166930
 
6.4%
g 163835
 
6.3%
r 124382
 
4.8%
Other values (51) 478158
18.4%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 1824030
70.0%
Uppercase Letter 612259
 
23.5%
Space Separator 166930
 
6.4%
Decimal Number 941
 
< 0.1%
Other Punctuation 250
 
< 0.1%
Dash Punctuation 10
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
o 402623
22.1%
e 238641
13.1%
a 237508
13.0%
t 194824
10.7%
l 169506
9.3%
g 163835
9.0%
r 124382
 
6.8%
h 120880
 
6.6%
c 72658
 
4.0%
s 44356
 
2.4%
Other values (14) 54817
 
3.0%
Uppercase Letter
ValueCountFrequency (%)
G 236572
38.6%
E 191441
31.3%
O 70685
 
11.5%
L 65530
 
10.7%
M 42663
 
7.0%
S 1607
 
0.3%
P 1564
 
0.3%
C 982
 
0.2%
N 745
 
0.1%
B 158
 
< 0.1%
Other values (8) 312
 
0.1%
Decimal Number
ValueCountFrequency (%)
9 213
22.6%
1 200
21.3%
7 175
18.6%
2 170
18.1%
0 94
10.0%
6 48
 
5.1%
8 16
 
1.7%
4 14
 
1.5%
3 9
 
1.0%
5 2
 
0.2%
Other Punctuation
ValueCountFrequency (%)
, 85
34.0%
& 49
19.6%
/ 48
19.2%
. 43
17.2%
: 21
 
8.4%
" 2
 
0.8%
; 2
 
0.8%
Space Separator
ValueCountFrequency (%)
166930
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 10
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 2436289
93.5%
Common 168131
 
6.5%

Most frequent character per script

Latin
ValueCountFrequency (%)
o 402623
16.5%
e 238641
9.8%
a 237508
9.7%
G 236572
9.7%
t 194824
8.0%
E 191441
7.9%
l 169506
7.0%
g 163835
6.7%
r 124382
 
5.1%
h 120880
 
5.0%
Other values (32) 356077
14.6%
Common
ValueCountFrequency (%)
166930
99.3%
9 213
 
0.1%
1 200
 
0.1%
7 175
 
0.1%
2 170
 
0.1%
0 94
 
0.1%
, 85
 
0.1%
& 49
 
< 0.1%
/ 48
 
< 0.1%
6 48
 
< 0.1%
Other values (9) 119
 
0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 2604420
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
o 402623
15.5%
e 238641
9.2%
a 237508
9.1%
G 236572
9.1%
t 194824
7.5%
E 191441
7.4%
l 169506
 
6.5%
166930
 
6.4%
g 163835
 
6.3%
r 124382
 
4.8%
Other values (51) 478158
18.4%

georeferenceRemarks
Text

Missing 

Distinct: 1134
Distinct (%): 13.4%
Missing: 596270
Missing (%): 98.6%
Memory size: 4.6 MiB

Length

Max length: 200
Median length: 182
Mean length: 45.17183432
Min length: 10

Characters and Unicode

Total characters: 381702
Distinct characters: 69
Distinct categories: 11
Distinct scripts: 2
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 400
Unique (%): 4.7%

Sample

1st row: Coordinate Uncertainty In Meters: 56182
2nd row: Coordinate Uncertainty In Meters: 49611
3rd row: Coordinate Uncertainty In Meters: 97700
4th row: Coordinate Uncertainty In Meters: 41787
5th row: Coordinate Uncertainty In Meters: 71236
ValueCountFrequency (%)
in 8280
17.4%
coordinate 8141
17.1%
meters 8141
17.1%
uncertainty 8141
17.1%
verbatim 1307
 
2.7%
coordinate-degrees 1307
 
2.7%
minutes 1307
 
2.7%
3792 274
 
0.6%
the 221
 
0.5%
6066 174
 
0.4%
Other values (1273) 10425
21.8%

Most occurring characters

ValueCountFrequency (%)
e 42275
 
11.1%
39268
 
10.3%
t 37520
 
9.8%
n 36171
 
9.5%
r 29384
 
7.7%
i 21348
 
5.6%
o 20139
 
5.3%
a 19993
 
5.2%
s 11760
 
3.1%
d 9751
 
2.6%
Other values (59) 114093
29.9%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 255005
66.8%
Space Separator 39268
 
10.3%
Decimal Number 38776
 
10.2%
Uppercase Letter 37573
 
9.8%
Other Punctuation 9667
 
2.5%
Dash Punctuation 1342
 
0.4%
Open Punctuation 33
 
< 0.1%
Close Punctuation 33
 
< 0.1%
Initial Punctuation 2
 
< 0.1%
Final Punctuation 2
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 42275
16.6%
t 37520
14.7%
n 36171
14.2%
r 29384
11.5%
i 21348
8.4%
o 20139
7.9%
a 19993
7.8%
s 11760
 
4.6%
d 9751
 
3.8%
c 8647
 
3.4%
Other values (16) 18017
7.1%
Uppercase Letter
ValueCountFrequency (%)
C 9647
25.7%
M 8188
21.8%
U 8175
21.8%
I 8162
21.7%
D 1329
 
3.5%
V 1307
 
3.5%
T 264
 
0.7%
N 88
 
0.2%
S 85
 
0.2%
G 82
 
0.2%
Other values (10) 246
 
0.7%
Decimal Number
ValueCountFrequency (%)
1 4555
11.7%
6 4451
11.5%
0 4424
11.4%
3 4272
11.0%
2 4116
10.6%
5 3998
10.3%
4 3413
8.8%
7 3301
8.5%
9 3147
8.1%
8 3099
8.0%
Other Punctuation
ValueCountFrequency (%)
: 8141
84.2%
; 1326
 
13.7%
, 101
 
1.0%
. 90
 
0.9%
' 5
 
0.1%
" 4
 
< 0.1%
Space Separator
ValueCountFrequency (%)
39268
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 1342
100.0%
Open Punctuation
ValueCountFrequency (%)
( 33
100.0%
Close Punctuation
ValueCountFrequency (%)
) 33
100.0%
Initial Punctuation
ValueCountFrequency (%)
2
100.0%
Final Punctuation
ValueCountFrequency (%)
2
100.0%
Math Symbol
ValueCountFrequency (%)
+ 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 292578
76.7%
Common 89124
 
23.3%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 42275
14.4%
t 37520
12.8%
n 36171
12.4%
r 29384
10.0%
i 21348
 
7.3%
o 20139
 
6.9%
a 19993
 
6.8%
s 11760
 
4.0%
d 9751
 
3.3%
C 9647
 
3.3%
Other values (36) 54590
18.7%
Common
ValueCountFrequency (%)
39268
44.1%
: 8141
 
9.1%
1 4555
 
5.1%
6 4451
 
5.0%
0 4424
 
5.0%
3 4272
 
4.8%
2 4116
 
4.6%
5 3998
 
4.5%
4 3413
 
3.8%
7 3301
 
3.7%
Other values (13) 9185
 
10.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 381693
> 99.9%
None 5
 
< 0.1%
Punctuation 4
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 42275
 
11.1%
39268
 
10.3%
t 37520
 
9.8%
n 36171
 
9.5%
r 29384
 
7.7%
i 21348
 
5.6%
o 20139
 
5.3%
a 19993
 
5.2%
s 11760
 
3.1%
d 9751
 
2.6%
Other values (56) 114084
29.9%
None
ValueCountFrequency (%)
ñ 5
100.0%
Punctuation
ValueCountFrequency (%)
2
50.0%
2
50.0%

geologicalContextID
Text

Missing 

Distinct: 4
Distinct (%): 100.0%
Missing: 604716
Missing (%): > 99.9%
Memory size: 4.6 MiB

Length

Max length: 32
Median length: 17
Mean length: 17.5
Min length: 4

Characters and Unicode

Total characters: 70
Distinct characters: 25
Distinct categories: 6
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 4
Unique (%): 100.0%

Sample

1st row: Hagen in Selys
2nd row: Brazil, [Not Stated]
3rd row: United States, Florida, Pinellas
4th row: Peru
ValueCountFrequency (%)
hagen 1
9.1%
in 1
9.1%
selys 1
9.1%
brazil 1
9.1%
not 1
9.1%
stated 1
9.1%
united 1
9.1%
states 1
9.1%
florida 1
9.1%
pinellas 1
9.1%

Most occurring characters

ValueCountFrequency (%)
7
 
10.0%
e 7
 
10.0%
t 6
 
8.6%
a 6
 
8.6%
i 5
 
7.1%
l 5
 
7.1%
n 4
 
5.7%
S 3
 
4.3%
s 3
 
4.3%
, 3
 
4.3%
Other values (15) 21
30.0%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 48
68.6%
Uppercase Letter 10
 
14.3%
Space Separator 7
 
10.0%
Other Punctuation 3
 
4.3%
Close Punctuation 1
 
1.4%
Open Punctuation 1
 
1.4%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 7
14.6%
t 6
12.5%
a 6
12.5%
i 5
10.4%
l 5
10.4%
n 4
8.3%
s 3
6.2%
d 3
6.2%
r 3
6.2%
o 2
 
4.2%
Other values (4) 4
8.3%
Uppercase Letter
ValueCountFrequency (%)
S 3
30.0%
P 2
20.0%
F 1
 
10.0%
U 1
 
10.0%
H 1
 
10.0%
N 1
 
10.0%
B 1
 
10.0%
Space Separator
ValueCountFrequency (%)
7
100.0%
Other Punctuation
ValueCountFrequency (%)
, 3
100.0%
Close Punctuation
ValueCountFrequency (%)
] 1
100.0%
Open Punctuation
ValueCountFrequency (%)
[ 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 58
82.9%
Common 12
 
17.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 7
12.1%
t 6
10.3%
a 6
10.3%
i 5
 
8.6%
l 5
 
8.6%
n 4
 
6.9%
S 3
 
5.2%
s 3
 
5.2%
d 3
 
5.2%
r 3
 
5.2%
Other values (11) 13
22.4%
Common
ValueCountFrequency (%)
7
58.3%
, 3
25.0%
] 1
 
8.3%
[ 1
 
8.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 70
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
7
 
10.0%
e 7
 
10.0%
t 6
 
8.6%
a 6
 
8.6%
i 5
 
7.1%
l 5
 
7.1%
n 4
 
5.7%
S 3
 
4.3%
s 3
 
4.3%
, 3
 
4.3%
Other values (15) 21
30.0%

earliestEonOrLowestEonothem
Text

Constant  Missing 

Distinct: 1
Distinct (%): 100.0%
Missing: 604719
Missing (%): > 99.9%
Memory size: 4.6 MiB

Length

Max length: 61
Median length: 61
Mean length: 61
Min length: 61

Characters and Unicode

Total characters: 61
Distinct characters: 19
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1
Unique (%): 100.0%

Sample

1st row: Animalia, Arthropoda, Insecta, Odonata, Anisoptera, Aeshnidae
ValueCountFrequency (%)
animalia 1
16.7%
arthropoda 1
16.7%
insecta 1
16.7%
odonata 1
16.7%
anisoptera 1
16.7%
aeshnidae 1
16.7%

Most occurring characters

ValueCountFrequency (%)
a 8
13.1%
5
 
8.2%
n 5
 
8.2%
, 5
 
8.2%
A 4
 
6.6%
e 4
 
6.6%
o 4
 
6.6%
t 4
 
6.6%
i 4
 
6.6%
r 3
 
4.9%
Other values (9) 15
24.6%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 45
73.8%
Uppercase Letter 6
 
9.8%
Space Separator 5
 
8.2%
Other Punctuation 5
 
8.2%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 8
17.8%
n 5
11.1%
e 4
8.9%
o 4
8.9%
t 4
8.9%
i 4
8.9%
r 3
 
6.7%
d 3
 
6.7%
s 3
 
6.7%
h 2
 
4.4%
Other values (4) 5
11.1%
Uppercase Letter
ValueCountFrequency (%)
A 4
66.7%
I 1
 
16.7%
O 1
 
16.7%
Space Separator
ValueCountFrequency (%)
5
100.0%
Other Punctuation
ValueCountFrequency (%)
, 5
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 51
83.6%
Common 10
 
16.4%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 8
15.7%
n 5
9.8%
A 4
7.8%
e 4
7.8%
o 4
7.8%
t 4
7.8%
i 4
7.8%
r 3
 
5.9%
d 3
 
5.9%
s 3
 
5.9%
Other values (7) 9
17.6%
Common
ValueCountFrequency (%)
5
50.0%
, 5
50.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 61
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 8
13.1%
5
 
8.2%
n 5
 
8.2%
, 5
 
8.2%
A 4
 
6.6%
e 4
 
6.6%
o 4
 
6.6%
t 4
 
6.6%
i 4
 
6.6%
r 3
 
4.9%
Other values (9) 15
24.6%

latestEonOrHighestEonothem
Text

Constant  Missing 

Distinct: 1
Distinct (%): 100.0%
Missing: 604719
Missing (%): > 99.9%
Memory size: 4.6 MiB

Length

Max length: 8
Median length: 8
Mean length: 8
Min length: 8

Characters and Unicode

Total characters: 8
Distinct characters: 6
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1
Unique (%): 100.0%

Sample

1st row: Animalia
ValueCountFrequency (%)
animalia 1
100.0%

Most occurring characters

ValueCountFrequency (%)
i 2
25.0%
a 2
25.0%
A 1
12.5%
n 1
12.5%
m 1
12.5%
l 1
12.5%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 7
87.5%
Uppercase Letter 1
 
12.5%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
i 2
28.6%
a 2
28.6%
n 1
14.3%
m 1
14.3%
l 1
14.3%
Uppercase Letter
ValueCountFrequency (%)
A 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 8
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
i 2
25.0%
a 2
25.0%
A 1
12.5%
n 1
12.5%
m 1
12.5%
l 1
12.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII 8
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
i 2
25.0%
a 2
25.0%
A 1
12.5%
n 1
12.5%
m 1
12.5%
l 1
12.5%

earliestEraOrLowestErathem
Text

Constant  Missing 

Distinct: 1
Distinct (%): 100.0%
Missing: 604719
Missing (%): > 99.9%
Memory size: 4.6 MiB

Length

Max length: 10
Median length: 10
Mean length: 10
Min length: 10

Characters and Unicode

Total characters: 10
Distinct characters: 8
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1
Unique (%): 100.0%

Sample

1st row: Arthropoda
ValueCountFrequency (%)
arthropoda 1
100.0%

Most occurring characters

ValueCountFrequency (%)
r 2
20.0%
o 2
20.0%
A 1
10.0%
t 1
10.0%
h 1
10.0%
p 1
10.0%
d 1
10.0%
a 1
10.0%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 9
90.0%
Uppercase Letter 1
 
10.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
r 2
22.2%
o 2
22.2%
t 1
11.1%
h 1
11.1%
p 1
11.1%
d 1
11.1%
a 1
11.1%
Uppercase Letter
ValueCountFrequency (%)
A 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 10
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
r 2
20.0%
o 2
20.0%
A 1
10.0%
t 1
10.0%
h 1
10.0%
p 1
10.0%
d 1
10.0%
a 1
10.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 10
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
r 2
20.0%
o 2
20.0%
A 1
10.0%
t 1
10.0%
h 1
10.0%
p 1
10.0%
d 1
10.0%
a 1
10.0%

latestEraOrHighestErathem
Text

Constant  Missing 

Distinct: 1
Distinct (%): 100.0%
Missing: 604719
Missing (%): > 99.9%
Memory size: 4.6 MiB

Length

Max length: 7
Median length: 7
Mean length: 7
Min length: 7

Characters and Unicode

Total characters: 7
Distinct characters: 7
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1
Unique (%): 100.0%

Sample

1st row: Insecta
ValueCountFrequency (%)
insecta 1
100.0%

Most occurring characters

ValueCountFrequency (%)
I 1
14.3%
n 1
14.3%
s 1
14.3%
e 1
14.3%
c 1
14.3%
t 1
14.3%
a 1
14.3%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 6
85.7%
Uppercase Letter 1
 
14.3%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
n 1
16.7%
s 1
16.7%
e 1
16.7%
c 1
16.7%
t 1
16.7%
a 1
16.7%
Uppercase Letter
ValueCountFrequency (%)
I 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 7
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
I 1
14.3%
n 1
14.3%
s 1
14.3%
e 1
14.3%
c 1
14.3%
t 1
14.3%
a 1
14.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 7
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
I 1
14.3%
n 1
14.3%
s 1
14.3%
e 1
14.3%
c 1
14.3%
t 1
14.3%
a 1
14.3%
Distinct: 4
Distinct (%): 100.0%
Missing: 604716
Missing (%): > 99.9%
Memory size: 4.6 MiB

Length

Max length: 13
Median length: 6.5
Mean length: 7.5
Min length: 4

Characters and Unicode

Total characters: 30
Distinct characters: 18
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 4
Unique (%): 100.0%

Sample

1st row: Brazil
2nd row: United States
3rd row: Odonata
4th row: Peru
ValueCountFrequency (%)
brazil 1
20.0%
united 1
20.0%
states 1
20.0%
odonata 1
20.0%
peru 1
20.0%

Most occurring characters

ValueCountFrequency (%)
a 4
13.3%
t 4
13.3%
e 3
 
10.0%
i 2
 
6.7%
n 2
 
6.7%
r 2
 
6.7%
d 2
 
6.7%
S 1
 
3.3%
P 1
 
3.3%
o 1
 
3.3%
Other values (8) 8
26.7%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 24
80.0%
Uppercase Letter 5
 
16.7%
Space Separator 1
 
3.3%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 4
16.7%
t 4
16.7%
e 3
12.5%
i 2
8.3%
n 2
8.3%
r 2
8.3%
d 2
8.3%
o 1
 
4.2%
s 1
 
4.2%
l 1
 
4.2%
Other values (2) 2
8.3%
Uppercase Letter
ValueCountFrequency (%)
S 1
20.0%
P 1
20.0%
O 1
20.0%
B 1
20.0%
U 1
20.0%
Space Separator
ValueCountFrequency (%)
1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 29
96.7%
Common 1
 
3.3%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 4
13.8%
t 4
13.8%
e 3
10.3%
i 2
 
6.9%
n 2
 
6.9%
r 2
 
6.9%
d 2
 
6.9%
S 1
 
3.4%
P 1
 
3.4%
o 1
 
3.4%
Other values (7) 7
24.1%
Common
ValueCountFrequency (%)
1
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 30
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 4
13.3%
t 4
13.3%
e 3
 
10.0%
i 2
 
6.7%
n 2
 
6.7%
r 2
 
6.7%
d 2
 
6.7%
S 1
 
3.3%
P 1
 
3.3%
o 1
 
3.3%
Other values (8) 8
26.7%
Distinct: 3
Distinct (%): 100.0%
Missing: 604717
Missing (%): > 99.9%
Memory size: 4.6 MiB

Length

Max length: 12
Median length: 9
Mean length: 9.333333333
Min length: 7

Characters and Unicode

Total characters: 28
Distinct characters: 18
Distinct categories: 5
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 3
Unique (%): 100.0%

Sample

1st row: [Not Stated]
2nd row: Florida
3rd row: Aeshnidae
ValueCountFrequency (%)
not 1
25.0%
stated 1
25.0%
florida 1
25.0%
aeshnidae 1
25.0%

Most occurring characters

ValueCountFrequency (%)
t 3
 
10.7%
a 3
 
10.7%
e 3
 
10.7%
d 3
 
10.7%
o 2
 
7.1%
i 2
 
7.1%
[ 1
 
3.6%
r 1
 
3.6%
h 1
 
3.6%
s 1
 
3.6%
Other values (8) 8
28.6%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 21
75.0%
Uppercase Letter 4
 
14.3%
Open Punctuation 1
 
3.6%
Close Punctuation 1
 
3.6%
Space Separator 1
 
3.6%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
t 3
14.3%
a 3
14.3%
e 3
14.3%
d 3
14.3%
o 2
9.5%
i 2
9.5%
r 1
 
4.8%
h 1
 
4.8%
s 1
 
4.8%
l 1
 
4.8%
Uppercase Letter
ValueCountFrequency (%)
A 1
25.0%
F 1
25.0%
N 1
25.0%
S 1
25.0%
Open Punctuation
ValueCountFrequency (%)
[ 1
100.0%
Close Punctuation
ValueCountFrequency (%)
] 1
100.0%
Space Separator
ValueCountFrequency (%)
1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 25
89.3%
Common 3
 
10.7%

Most frequent character per script

Latin
ValueCountFrequency (%)
t 3
12.0%
a 3
12.0%
e 3
12.0%
d 3
12.0%
o 2
 
8.0%
i 2
 
8.0%
r 1
 
4.0%
h 1
 
4.0%
s 1
 
4.0%
A 1
 
4.0%
Other values (5) 5
20.0%
Common
ValueCountFrequency (%)
[ 1
33.3%
] 1
33.3%
1
33.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 28
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
t 3
 
10.7%
a 3
 
10.7%
e 3
 
10.7%
d 3
 
10.7%
o 2
 
7.1%
i 2
 
7.1%
[ 1
 
3.6%
r 1
 
3.6%
h 1
 
3.6%
s 1
 
3.6%
Other values (8) 8
28.6%

latestEpochOrHighestSeries
Text

Constant  Missing 

Distinct: 1
Distinct (%): 100.0%
Missing: 604719
Missing (%): > 99.9%
Memory size: 4.6 MiB

Length

Max length: 8
Median length: 8
Mean length: 8
Min length: 8

Characters and Unicode

Total characters: 8
Distinct characters: 7
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1
Unique (%): 100.0%

Sample

1st row: Pinellas
ValueCountFrequency (%)
pinellas 1
100.0%

Most occurring characters

ValueCountFrequency (%)
l 2
25.0%
P 1
12.5%
i 1
12.5%
n 1
12.5%
e 1
12.5%
a 1
12.5%
s 1
12.5%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 7
87.5%
Uppercase Letter 1
 
12.5%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
l 2
28.6%
i 1
14.3%
n 1
14.3%
e 1
14.3%
a 1
14.3%
s 1
14.3%
Uppercase Letter
ValueCountFrequency (%)
P 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 8
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
l 2
25.0%
P 1
12.5%
i 1
12.5%
n 1
12.5%
e 1
12.5%
a 1
12.5%
s 1
12.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII 8
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
l 2
25.0%
P 1
12.5%
i 1
12.5%
n 1
12.5%
e 1
12.5%
a 1
12.5%
s 1
12.5%
Distinct: 3
Distinct (%): 100.0%
Missing: 604717
Missing (%): > 99.9%
Memory size: 4.6 MiB

Length

Max length: 31
Median length: 14
Mean length: 19
Min length: 12

Characters and Unicode

Total characters: 57
Distinct characters: 28
Distinct categories: 7
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 3
Unique (%): 100.0%

Sample

1st row: [Not Stated]
2nd row: St. Petersburg
3rd row: Huaru Valley, 90 mi. N. of Lima
ValueCountFrequency (%)
not 1
9.1%
stated 1
9.1%
st 1
9.1%
petersburg 1
9.1%
huaru 1
9.1%
valley 1
9.1%
90 1
9.1%
mi 1
9.1%
n 1
9.1%
of 1
9.1%

Most occurring characters

ValueCountFrequency (%)
8
 
14.0%
t 5
 
8.8%
a 4
 
7.0%
e 4
 
7.0%
u 3
 
5.3%
. 3
 
5.3%
r 3
 
5.3%
i 2
 
3.5%
o 2
 
3.5%
S 2
 
3.5%
Other values (18) 21
36.8%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 33
57.9%
Space Separator 8
 
14.0%
Uppercase Letter 8
 
14.0%
Other Punctuation 4
 
7.0%
Decimal Number 2
 
3.5%
Open Punctuation 1
 
1.8%
Close Punctuation 1
 
1.8%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
t 5
15.2%
a 4
12.1%
e 4
12.1%
u 3
9.1%
r 3
9.1%
i 2
 
6.1%
o 2
 
6.1%
l 2
 
6.1%
m 2
 
6.1%
f 1
 
3.0%
Other values (5) 5
15.2%
Uppercase Letter
ValueCountFrequency (%)
S 2
25.0%
N 2
25.0%
V 1
12.5%
H 1
12.5%
P 1
12.5%
L 1
12.5%
Other Punctuation
ValueCountFrequency (%)
. 3
75.0%
, 1
 
25.0%
Decimal Number
ValueCountFrequency (%)
9 1
50.0%
0 1
50.0%
Space Separator
ValueCountFrequency (%)
8
100.0%
Open Punctuation
ValueCountFrequency (%)
[ 1
100.0%
Close Punctuation
ValueCountFrequency (%)
] 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 41
71.9%
Common 16
 
28.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
t 5
12.2%
a 4
 
9.8%
e 4
 
9.8%
u 3
 
7.3%
r 3
 
7.3%
i 2
 
4.9%
o 2
 
4.9%
S 2
 
4.9%
l 2
 
4.9%
m 2
 
4.9%
Other values (11) 12
29.3%
Common
ValueCountFrequency (%)
8
50.0%
. 3
 
18.8%
, 1
 
6.2%
9 1
 
6.2%
[ 1
 
6.2%
0 1
 
6.2%
] 1
 
6.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII 57
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
8
 
14.0%
t 5
 
8.8%
a 4
 
7.0%
e 4
 
7.0%
u 3
 
5.3%
. 3
 
5.3%
r 3
 
5.3%
i 2
 
3.5%
o 2
 
3.5%
S 2
 
3.5%
Other values (18) 21
36.8%

lowestBiostratigraphicZone
Text

Constant  Missing 

Distinct: 1
Distinct (%): 100.0%
Missing: 604719
Missing (%): > 99.9%
Memory size: 4.6 MiB

Length

Max length: 10
Median length: 10
Mean length: 10
Min length: 10

Characters and Unicode

Total characters: 10
Distinct characters: 7
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1
Unique (%): 100.0%

Sample

1st row: Gynacantha
ValueCountFrequency (%)
gynacantha 1
100.0%

Most occurring characters

ValueCountFrequency (%)
a 3
30.0%
n 2
20.0%
G 1
 
10.0%
y 1
 
10.0%
c 1
 
10.0%
t 1
 
10.0%
h 1
 
10.0%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 9
90.0%
Uppercase Letter 1
 
10.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 3
33.3%
n 2
22.2%
y 1
 
11.1%
c 1
 
11.1%
t 1
 
11.1%
h 1
 
11.1%
Uppercase Letter
ValueCountFrequency (%)
G 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 10
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 3
30.0%
n 2
20.0%
G 1
 
10.0%
y 1
 
10.0%
c 1
 
10.0%
t 1
 
10.0%
h 1
 
10.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 10
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 3
30.0%
n 2
20.0%
G 1
 
10.0%
y 1
 
10.0%
c 1
 
10.0%
t 1
 
10.0%
h 1
 
10.0%

formation
Text

Constant  Missing 

Distinct: 1
Distinct (%): 100.0%
Missing: 604719
Missing (%): > 99.9%
Memory size: 4.6 MiB

Length

Max length: 11
Median length: 11
Mean length: 11
Min length: 11

Characters and Unicode

Total characters: 11
Distinct characters: 9
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1
Unique (%): 100.0%

Sample

1st row: membranalis
ValueCountFrequency (%)
membranalis 1
100.0%

Most occurring characters

ValueCountFrequency (%)
m 2
18.2%
a 2
18.2%
e 1
9.1%
b 1
9.1%
r 1
9.1%
n 1
9.1%
l 1
9.1%
i 1
9.1%
s 1
9.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 11
100.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
m 2
18.2%
a 2
18.2%
e 1
9.1%
b 1
9.1%
r 1
9.1%
n 1
9.1%
l 1
9.1%
i 1
9.1%
s 1
9.1%

Most occurring scripts

ValueCountFrequency (%)
Latin 11
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
m 2
18.2%
a 2
18.2%
e 1
9.1%
b 1
9.1%
r 1
9.1%
n 1
9.1%
l 1
9.1%
i 1
9.1%
s 1
9.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 11
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
m 2
18.2%
a 2
18.2%
e 1
9.1%
b 1
9.1%
r 1
9.1%
n 1
9.1%
l 1
9.1%
i 1
9.1%
s 1
9.1%
Distinct: 16
Distinct (%): 1.1%
Missing: 603282
Missing (%): 99.8%
Memory size: 4.6 MiB

Length

Max length: 13
Median length: 9
Mean length: 5.812934631
Min length: 2

Characters and Unicode

Total characters: 8359
Distinct characters: 24
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 3
Unique (%): 0.2%

Sample

1st row: near
2nd row: uncertain
3rd row: near
4th row: near
5th row: cf.
ValueCountFrequency (%)
near 466
31.6%
uncertain 459
31.1%
cf 238
16.1%
group 113
 
7.7%
subgroup 80
 
5.4%
complex 26
 
1.8%
aff 21
 
1.4%
sp 21
 
1.4%
n 15
 
1.0%
sensu 11
 
0.7%
Other values (6) 24
 
1.6%

Most occurring characters

ValueCountFrequency (%)
n 1418
17.0%
r 1132
13.5%
e 962
11.5%
a 948
11.3%
u 743
8.9%
c 733
8.8%
t 481
 
5.8%
i 471
 
5.6%
f 280
 
3.3%
p 240
 
2.9%
Other values (14) 951
11.4%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 8138
97.4%
Other Punctuation 180
 
2.2%
Space Separator 36
 
0.4%
Uppercase Letter 5
 
0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
n 1418
17.4%
r 1132
13.9%
e 962
11.8%
a 948
11.6%
u 743
9.1%
c 733
9.0%
t 481
 
5.9%
i 471
 
5.8%
f 280
 
3.4%
p 240
 
2.9%
Other values (9) 730
9.0%
Uppercase Letter
ValueCountFrequency (%)
C 2
40.0%
B 2
40.0%
K 1
20.0%
Other Punctuation
ValueCountFrequency (%)
. 180
100.0%
Space Separator
ValueCountFrequency (%)
36
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 8143
97.4%
Common 216
 
2.6%

Most frequent character per script

Latin
ValueCountFrequency (%)
n 1418
17.4%
r 1132
13.9%
e 962
11.8%
a 948
11.6%
u 743
9.1%
c 733
9.0%
t 481
 
5.9%
i 471
 
5.8%
f 280
 
3.4%
p 240
 
2.9%
Other values (12) 735
9.0%
Common
ValueCountFrequency (%)
. 180
83.3%
36
 
16.7%

Most occurring blocks

ValueCountFrequency (%)
ASCII 8359
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
n 1418
17.0%
r 1132
13.5%
e 962
11.5%
a 948
11.3%
u 743
8.9%
c 733
8.8%
t 481
 
5.8%
i 471
 
5.6%
f 280
 
3.3%
p 240
 
2.9%
Other values (14) 951
11.4%

typeStatus
Text

Missing 

Distinct62
Distinct (%)0.1%
Missing486142
Missing (%)80.4%
Memory size4.6 MiB
2025-01-14T11:40:37.897777image/svg+xmlMatplotlib v3.9.4, https://matplotlib.org/

Length

Max length 28
Median length 8
Mean length 7.058653376
Min length 1

Characters and Unicode

Total characters 837001
Distinct characters 26
Distinct categories 4
Distinct scripts 2
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
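
As an illustration of how such per-category counts can be derived, the sketch below tallies Unicode general categories with Python's standard `unicodedata` module. It is a minimal example run on the five sample values listed for typeStatus, not the report's own implementation; the `CATEGORY_NAMES` mapping and the `category_counts` helper are hypothetical names introduced here.

```python
import unicodedata
from collections import Counter

# Long names for a few general-category codes; any other code is kept as-is.
CATEGORY_NAMES = {
    "Ll": "Lowercase Letter",
    "Lu": "Uppercase Letter",
    "Zs": "Space Separator",
    "Po": "Other Punctuation",
    "Nd": "Decimal Number",
    "Pd": "Dash Punctuation",
    "Ps": "Open Punctuation",
    "Pe": "Close Punctuation",
}


def category_counts(values):
    """Count characters per Unicode general category, skipping missing values."""
    counts = Counter()
    for value in values:
        if value is None:
            continue
        for ch in value:
            code = unicodedata.category(ch)
            counts[CATEGORY_NAMES.get(code, code)] += 1
    return counts


# The five sample rows shown for typeStatus, plus one missing value.
sample = ["Paratype", "Type", "Holotype", "Type", "Primary Syntype", None]
counts = category_counts(sample)
total = sum(counts.values())
for name, n in counts.most_common():
    print(f"{name}: {n} ({n / total:.1%})")
```

The standard library exposes the general category only; script and block breakdowns like those in this report would need additional Unicode data.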

Unique

Unique 20
Unique (%) < 0.1%

Sample

1st row Paratype
2nd row Type
3rd row Holotype
4th row Type
5th row Primary Syntype

Value               Count   Frequency (%)
holotype            54132   44.3%
type                32982   27.0%
syntype             13149   10.8%
paratype            11029   9.0%
lectotype           5242    4.3%
primary             3223    2.6%
allotype            1092    0.9%
syntypes            429     0.4%
neotype             316     0.3%
cotype              298     0.2%
Other values (14)   175     0.1%

Most occurring characters

ValueCountFrequency (%)
y 135631
16.2%
e 124524
14.9%
p 118840
14.2%
o 115382
13.8%
t 91216
10.9%
l 56450
6.7%
H 54135
 
6.5%
T 32989
 
3.9%
a 25558
 
3.1%
r 17612
 
2.1%
Other values (16) 64664
7.7%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 711090
85.0%
Uppercase Letter 122059
 
14.6%
Space Separator 3489
 
0.4%
Other Punctuation 363
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
y 135631
19.1%
e 124524
17.5%
p 118840
16.7%
o 115382
16.2%
t 91216
12.8%
l 56450
7.9%
a 25558
 
3.6%
r 17612
 
2.5%
n 13588
 
1.9%
c 5368
 
0.8%
Other values (5) 6921
 
1.0%
Uppercase Letter
ValueCountFrequency (%)
H 54135
44.4%
T 32989
27.0%
P 14391
 
11.8%
S 13578
 
11.1%
L 5247
 
4.3%
A 1094
 
0.9%
N 322
 
0.3%
C 303
 
0.2%
Other Punctuation
ValueCountFrequency (%)
; 254
70.0%
? 109
30.0%
Space Separator
ValueCountFrequency (%)
3489
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 833149
99.5%
Common 3852
 
0.5%

Most frequent character per script

Latin
ValueCountFrequency (%)
y 135631
16.3%
e 124524
14.9%
p 118840
14.3%
o 115382
13.8%
t 91216
10.9%
l 56450
6.8%
H 54135
 
6.5%
T 32989
 
4.0%
a 25558
 
3.1%
r 17612
 
2.1%
Other values (13) 60812
7.3%
Common
ValueCountFrequency (%)
3489
90.6%
; 254
 
6.6%
? 109
 
2.8%

Most occurring blocks

ValueCountFrequency (%)
ASCII 837001
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
y 135631
16.2%
e 124524
14.9%
p 118840
14.2%
o 115382
13.8%
t 91216
10.9%
l 56450
6.7%
H 54135
 
6.5%
T 32989
 
3.9%
a 25558
 
3.1%
r 17612
 
2.1%
Other values (16) 64664
7.7%

identifiedBy
Text

Missing 

Distinct 2736
Distinct (%) 1.8%
Missing 455024
Missing (%) 75.2%
Memory size 4.6 MiB

Length

Max length 150
Median length 106
Mean length 27.7928268
Min length 2

Characters and Unicode

Total characters 4160475
Distinct characters 71
Distinct categories 7
Distinct scripts 2
Distinct blocks 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 933
Unique (%) 0.6%

Sample

1st row Westfall, M. J., Jr.
2nd row Donnelly, Thomas W.
3rd row Flint, Oliver S., Jr., (ENT), Smithsonian Institution - National Museum of Natural History (UNITED STATES)
4th row Kormann, K.
5th row DeMarmels

Value                 Count    Frequency (%)
w                     28134    4.4%
united                24412    3.8%
states                24411    3.8%
(empty)               22738    3.5%
of                    22001    3.4%
s                     21919    3.4%
smithsonian           21911    3.4%
institution           21911    3.4%
museum                21368    3.3%
natural               21090    3.3%
Other values (2399)   413103   64.2%

Most occurring characters

ValueCountFrequency (%)
493302
 
11.9%
i 251011
 
6.0%
o 231967
 
5.6%
t 230937
 
5.6%
n 230507
 
5.5%
a 200387
 
4.8%
, 193571
 
4.7%
r 182856
 
4.4%
. 170385
 
4.1%
s 166946
 
4.0%
Other values (61) 1808606
43.5%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 2295528
55.2%
Uppercase Letter 890542
 
21.4%
Space Separator 493302
 
11.9%
Other Punctuation 364806
 
8.8%
Close Punctuation 46602
 
1.1%
Open Punctuation 46602
 
1.1%
Dash Punctuation 23093
 
0.6%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
i 251011
10.9%
o 231967
10.1%
t 230937
10.1%
n 230507
10.0%
a 200387
8.7%
r 182856
8.0%
s 166946
7.3%
l 162370
7.1%
e 157410
6.9%
u 114807
 
5.0%
Other values (23) 366330
16.0%
Uppercase Letter
ValueCountFrequency (%)
T 112744
12.7%
S 105400
11.8%
N 90473
10.2%
E 79714
 
9.0%
M 58710
 
6.6%
D 53045
 
6.0%
I 47398
 
5.3%
A 45391
 
5.1%
W 36649
 
4.1%
J 36232
 
4.1%
Other values (16) 224786
25.2%
Other Punctuation
ValueCountFrequency (%)
, 193571
53.1%
. 170385
46.7%
& 690
 
0.2%
' 157
 
< 0.1%
; 2
 
< 0.1%
? 1
 
< 0.1%
Close Punctuation
ValueCountFrequency (%)
) 46600
> 99.9%
] 2
 
< 0.1%
Open Punctuation
ValueCountFrequency (%)
( 46600
> 99.9%
[ 2
 
< 0.1%
Space Separator
ValueCountFrequency (%)
493302
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 23093
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 3186070
76.6%
Common 974405
 
23.4%

Most frequent character per script

Latin
ValueCountFrequency (%)
i 251011
 
7.9%
o 231967
 
7.3%
t 230937
 
7.2%
n 230507
 
7.2%
a 200387
 
6.3%
r 182856
 
5.7%
s 166946
 
5.2%
l 162370
 
5.1%
e 157410
 
4.9%
u 114807
 
3.6%
Other values (49) 1256872
39.4%
Common
ValueCountFrequency (%)
493302
50.6%
, 193571
 
19.9%
. 170385
 
17.5%
) 46600
 
4.8%
( 46600
 
4.8%
- 23093
 
2.4%
& 690
 
0.1%
' 157
 
< 0.1%
[ 2
 
< 0.1%
] 2
 
< 0.1%
Other values (2) 3
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 4160438
> 99.9%
None 37
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
493302
 
11.9%
i 251011
 
6.0%
o 231967
 
5.6%
t 230937
 
5.6%
n 230507
 
5.5%
a 200387
 
4.8%
, 193571
 
4.7%
r 182856
 
4.4%
. 170385
 
4.1%
s 166946
 
4.0%
Other values (54) 1808569
43.5%
None
ValueCountFrequency (%)
á 9
24.3%
ń 9
24.3%
ż 9
24.3%
ö 7
18.9%
ü 1
 
2.7%
è 1
 
2.7%
ä 1
 
2.7%

identifiedByID
Text

Missing 

Distinct 2
Distinct (%) 100.0%
Missing 604718
Missing (%) > 99.9%
Memory size 4.6 MiB

Length

Max length 8
Median length 7.5
Mean length 7.5
Min length 7

Characters and Unicode

Total characters 15
Distinct characters 10
Distinct categories 3
Distinct scripts 1
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 2
Unique (%) 100.0%

Sample

1st row 27.7731
2nd row -4.55006

Value     Count   Frequency (%)
27.7731   1       50.0%
4.55006   1       50.0%

Most occurring characters

ValueCountFrequency (%)
7 3
20.0%
. 2
13.3%
5 2
13.3%
0 2
13.3%
2 1
 
6.7%
3 1
 
6.7%
1 1
 
6.7%
- 1
 
6.7%
4 1
 
6.7%
6 1
 
6.7%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 12
80.0%
Other Punctuation 2
 
13.3%
Dash Punctuation 1
 
6.7%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
7 3
25.0%
5 2
16.7%
0 2
16.7%
2 1
 
8.3%
3 1
 
8.3%
1 1
 
8.3%
4 1
 
8.3%
6 1
 
8.3%
Other Punctuation
ValueCountFrequency (%)
. 2
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 15
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
7 3
20.0%
. 2
13.3%
5 2
13.3%
0 2
13.3%
2 1
 
6.7%
3 1
 
6.7%
1 1
 
6.7%
- 1
 
6.7%
4 1
 
6.7%
6 1
 
6.7%

Most occurring blocks

ValueCountFrequency (%)
ASCII 15
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
7 3
20.0%
. 2
13.3%
5 2
13.3%
0 2
13.3%
2 1
 
6.7%
3 1
 
6.7%
1 1
 
6.7%
- 1
 
6.7%
4 1
 
6.7%
6 1
 
6.7%

dateIdentified
Text

Missing 

Distinct 2
Distinct (%) 100.0%
Missing 604718
Missing (%) > 99.9%
Memory size 4.6 MiB

Length

Max length 8
Median length 7
Mean length 7
Min length 6

Characters and Unicode

Total characters 14
Distinct characters 8
Distinct categories 3
Distinct scripts 1
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 2
Unique (%) 100.0%

Sample

1st row -82.64
2nd row -76.1874

Value     Count   Frequency (%)
82.64     1       50.0%
76.1874   1       50.0%

Most occurring characters

ValueCountFrequency (%)
- 2
14.3%
8 2
14.3%
. 2
14.3%
6 2
14.3%
4 2
14.3%
7 2
14.3%
2 1
7.1%
1 1
7.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 10
71.4%
Dash Punctuation 2
 
14.3%
Other Punctuation 2
 
14.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
8 2
20.0%
6 2
20.0%
4 2
20.0%
7 2
20.0%
2 1
10.0%
1 1
10.0%
Dash Punctuation
ValueCountFrequency (%)
- 2
100.0%
Other Punctuation
ValueCountFrequency (%)
. 2
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 14
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
- 2
14.3%
8 2
14.3%
. 2
14.3%
6 2
14.3%
4 2
14.3%
7 2
14.3%
2 1
7.1%
1 1
7.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 14
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
- 2
14.3%
8 2
14.3%
. 2
14.3%
6 2
14.3%
4 2
14.3%
7 2
14.3%
2 1
7.1%
1 1
7.1%

identificationReferences
Text

Constant  Missing 

Distinct 1
Distinct (%) 100.0%
Missing 604719
Missing (%) > 99.9%
Memory size 4.6 MiB

Length

Max length 18
Median length 18
Mean length 18
Min length 18

Characters and Unicode

Total characters 18
Distinct characters 14
Distinct categories 6
Distinct scripts 2
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 1
Unique (%) 100.0%

Sample

1st row WGS 84 (EPSG:4326)

Value       Count   Frequency (%)
wgs         1       33.3%
84          1       33.3%
epsg:4326   1       33.3%

Most occurring characters

ValueCountFrequency (%)
G 2
11.1%
S 2
11.1%
2
11.1%
4 2
11.1%
W 1
 
5.6%
8 1
 
5.6%
( 1
 
5.6%
E 1
 
5.6%
P 1
 
5.6%
: 1
 
5.6%
Other values (4) 4
22.2%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 7
38.9%
Decimal Number 6
33.3%
Space Separator 2
 
11.1%
Open Punctuation 1
 
5.6%
Other Punctuation 1
 
5.6%
Close Punctuation 1
 
5.6%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
G 2
28.6%
S 2
28.6%
W 1
14.3%
E 1
14.3%
P 1
14.3%
Decimal Number
ValueCountFrequency (%)
4 2
33.3%
8 1
16.7%
3 1
16.7%
2 1
16.7%
6 1
16.7%
Space Separator
ValueCountFrequency (%)
2
100.0%
Open Punctuation
ValueCountFrequency (%)
( 1
100.0%
Other Punctuation
ValueCountFrequency (%)
: 1
100.0%
Close Punctuation
ValueCountFrequency (%)
) 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 11
61.1%
Latin 7
38.9%

Most frequent character per script

Common
ValueCountFrequency (%)
2
18.2%
4 2
18.2%
8 1
9.1%
( 1
9.1%
: 1
9.1%
3 1
9.1%
2 1
9.1%
6 1
9.1%
) 1
9.1%
Latin
ValueCountFrequency (%)
G 2
28.6%
S 2
28.6%
W 1
14.3%
E 1
14.3%
P 1
14.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 18
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
G 2
11.1%
S 2
11.1%
2
11.1%
4 2
11.1%
W 1
 
5.6%
8 1
 
5.6%
( 1
 
5.6%
E 1
 
5.6%
P 1
 
5.6%
: 1
 
5.6%
Other values (4) 4
22.2%
Distinct 245072
Distinct (%) 40.8%
Missing 4631
Missing (%) 0.8%
Memory size 4.6 MiB

Length

Max length 68
Median length 61
Mean length 20.77041739
Min length 3

Characters and Unicode

Total characters 12464099
Distinct characters 81
Distinct categories 11
Distinct scripts 2
Distinct blocks 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 201386
Unique (%) 33.6%

Sample

1st row Camponotus (Myrmosericus) rufoglaucus cinctella var. rufigenis
2nd row Athrips mesoleuca
3rd row Paranthrene asilipennis
4th row Acanthagrion trilobatum
5th row Calathus nanulus

Value                   Count     Frequency (%)
bombus                  69597     5.3%
sp                      44400     3.4%
pyrobombus              21249     1.6%
xylocopa                12224     0.9%
unidentified            9030      0.7%
argia                   8665      0.7%
apis                    8603      0.6%
enallagma               7977      0.6%
crambus                 7970      0.6%
ischnura                7458      0.6%
Other values (130820)   1127419   85.1%

Most occurring characters

ValueCountFrequency (%)
a 1254120
 
10.1%
i 1043361
 
8.4%
s 971373
 
7.8%
o 842893
 
6.8%
e 820899
 
6.6%
724503
 
5.8%
r 712805
 
5.7%
l 623128
 
5.0%
u 614998
 
4.9%
n 589887
 
4.7%
Other values (71) 4266132
34.2%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 10815150
86.8%
Space Separator 724503
 
5.8%
Uppercase Letter 692195
 
5.6%
Open Punctuation 92284
 
0.7%
Close Punctuation 92282
 
0.7%
Other Punctuation 46370
 
0.4%
Decimal Number 742
 
< 0.1%
Connector Punctuation 312
 
< 0.1%
Dash Punctuation 259
 
< 0.1%
Final Punctuation 1
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 1254120
11.6%
i 1043361
 
9.6%
s 971373
 
9.0%
o 842893
 
7.8%
e 820899
 
7.6%
r 712805
 
6.6%
l 623128
 
5.8%
u 614998
 
5.7%
n 589887
 
5.5%
t 542919
 
5.0%
Other values (18) 2798767
25.9%
Uppercase Letter
ValueCountFrequency (%)
P 97591
14.1%
B 85599
12.4%
A 75796
11.0%
C 69821
10.1%
S 43674
 
6.3%
E 42648
 
6.2%
L 33325
 
4.8%
M 31766
 
4.6%
T 31189
 
4.5%
H 29108
 
4.2%
Other values (16) 151678
21.9%
Decimal Number
ValueCountFrequency (%)
1 216
29.1%
9 110
14.8%
0 93
12.5%
2 79
 
10.6%
3 67
 
9.0%
4 55
 
7.4%
6 44
 
5.9%
5 30
 
4.0%
7 30
 
4.0%
8 18
 
2.4%
Other Punctuation
ValueCountFrequency (%)
. 46204
99.6%
? 109
 
0.2%
# 34
 
0.1%
/ 14
 
< 0.1%
, 4
 
< 0.1%
; 2
 
< 0.1%
' 2
 
< 0.1%
! 1
 
< 0.1%
Open Punctuation
ValueCountFrequency (%)
( 92226
99.9%
[ 58
 
0.1%
Close Punctuation
ValueCountFrequency (%)
) 92224
99.9%
] 58
 
0.1%
Space Separator
ValueCountFrequency (%)
724503
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 312
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 259
100.0%
Final Punctuation
ValueCountFrequency (%)
1
100.0%
Math Symbol
ValueCountFrequency (%)
= 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 11507345
92.3%
Common 956754
 
7.7%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 1254120
 
10.9%
i 1043361
 
9.1%
s 971373
 
8.4%
o 842893
 
7.3%
e 820899
 
7.1%
r 712805
 
6.2%
l 623128
 
5.4%
u 614998
 
5.3%
n 589887
 
5.1%
t 542919
 
4.7%
Other values (44) 3490962
30.3%
Common
ValueCountFrequency (%)
724503
75.7%
( 92226
 
9.6%
) 92224
 
9.6%
. 46204
 
4.8%
_ 312
 
< 0.1%
- 259
 
< 0.1%
1 216
 
< 0.1%
9 110
 
< 0.1%
? 109
 
< 0.1%
0 93
 
< 0.1%
Other values (17) 498
 
0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 12464077
> 99.9%
None 21
 
< 0.1%
Punctuation 1
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 1254120
 
10.1%
i 1043361
 
8.4%
s 971373
 
7.8%
o 842893
 
6.8%
e 820899
 
6.6%
724503
 
5.8%
r 712805
 
5.7%
l 623128
 
5.0%
u 614998
 
4.9%
n 589887
 
4.7%
Other values (68) 4266110
34.2%
None
ValueCountFrequency (%)
ö 19
90.5%
ñ 2
 
9.5%
Punctuation
ValueCountFrequency (%)
1
100.0%

originalNameUsage
Text

Constant  Missing 

Distinct 1
Distinct (%) 100.0%
Missing 604719
Missing (%) > 99.9%
Memory size 4.6 MiB

Length

Max length 12
Median length 12
Mean length 12
Min length 12

Characters and Unicode

Total characters 12
Distinct characters 11
Distinct categories 3
Distinct scripts 2
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 1
Unique (%) 100.0%

Sample

1st row Google Earth

Value    Count   Frequency (%)
google   1       50.0%
earth    1       50.0%

Most occurring characters

ValueCountFrequency (%)
o 2
16.7%
G 1
8.3%
g 1
8.3%
l 1
8.3%
e 1
8.3%
1
8.3%
E 1
8.3%
a 1
8.3%
r 1
8.3%
t 1
8.3%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 9
75.0%
Uppercase Letter 2
 
16.7%
Space Separator 1
 
8.3%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
o 2
22.2%
g 1
11.1%
l 1
11.1%
e 1
11.1%
a 1
11.1%
r 1
11.1%
t 1
11.1%
h 1
11.1%
Uppercase Letter
ValueCountFrequency (%)
G 1
50.0%
E 1
50.0%
Space Separator
ValueCountFrequency (%)
1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 11
91.7%
Common 1
 
8.3%

Most frequent character per script

Latin
ValueCountFrequency (%)
o 2
18.2%
G 1
9.1%
g 1
9.1%
l 1
9.1%
e 1
9.1%
E 1
9.1%
a 1
9.1%
r 1
9.1%
t 1
9.1%
h 1
9.1%
Common
ValueCountFrequency (%)
1
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 12
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
o 2
16.7%
G 1
8.3%
g 1
8.3%
l 1
8.3%
e 1
8.3%
1
8.3%
E 1
8.3%
a 1
8.3%
r 1
8.3%
t 1
8.3%
Distinct 3454
Distinct (%) 0.6%
Missing 4650
Missing (%) 0.8%
Memory size 4.6 MiB

Length

Max length 97
Median length 91
Mean length 62.39120769
Min length 9

Characters and Unicode

Total characters 37439092
Distinct characters 61
Distinct categories 6
Distinct scripts 2
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 574
Unique (%) 0.1%

Sample

1st row Animalia, Arthropoda, Insecta, Hymenoptera, Formicidae, Formicinae
2nd row Animalia, Arthropoda, Insecta, Lepidoptera, Gelechiidae, Gelechiinae
3rd row Animalia, Arthropoda, Insecta, Lepidoptera, Sesiidae, Sesiinae
4th row Animalia, Arthropoda, Insecta, Odonata, Zygoptera, Coenagrionidae
5th row Animalia, Arthropoda, Insecta, Coleoptera, Carabidae

Value                 Count     Frequency (%)
arthropoda            599790    17.3%
animalia              598420    17.3%
insecta               588007    17.0%
hymenoptera           146523    4.2%
odonata               117300    3.4%
lepidoptera           99955     2.9%
apidae                82945     2.4%
diptera               73546     2.1%
coleoptera            72087     2.1%
apinae                63529     1.8%
Other values (2936)   1026199   29.6%

Most occurring characters

ValueCountFrequency (%)
a 4571468
12.2%
e 2938748
 
7.8%
2868231
 
7.7%
, 2867865
 
7.7%
i 2865509
 
7.7%
o 2433279
 
6.5%
r 2317205
 
6.2%
t 2192393
 
5.9%
n 2160394
 
5.8%
p 1690401
 
4.5%
Other values (51) 10533599
28.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 28235044
75.4%
Uppercase Letter 3467869
 
9.3%
Space Separator 2868231
 
7.7%
Other Punctuation 2867934
 
7.7%
Decimal Number 10
 
< 0.1%
Connector Punctuation 4
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 4571468
16.2%
e 2938748
10.4%
i 2865509
10.1%
o 2433279
8.6%
r 2317205
8.2%
t 2192393
7.8%
n 2160394
7.7%
p 1690401
 
6.0%
d 1537978
 
5.4%
l 1128105
 
4.0%
Other values (16) 4399564
15.6%
Uppercase Letter
ValueCountFrequency (%)
A 1474246
42.5%
I 598269
17.3%
C 245242
 
7.1%
H 231754
 
6.7%
L 182551
 
5.3%
O 125511
 
3.6%
P 113924
 
3.3%
D 95383
 
2.8%
S 80630
 
2.3%
Z 57610
 
1.7%
Other values (15) 262749
 
7.6%
Decimal Number
ValueCountFrequency (%)
6 3
30.0%
0 2
20.0%
1 2
20.0%
3 2
20.0%
9 1
 
10.0%
Other Punctuation
ValueCountFrequency (%)
, 2867865
> 99.9%
? 39
 
< 0.1%
/ 30
 
< 0.1%
Space Separator
ValueCountFrequency (%)
2868231
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 4
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 31702913
84.7%
Common 5736179
 
15.3%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 4571468
14.4%
e 2938748
 
9.3%
i 2865509
 
9.0%
o 2433279
 
7.7%
r 2317205
 
7.3%
t 2192393
 
6.9%
n 2160394
 
6.8%
p 1690401
 
5.3%
d 1537978
 
4.9%
A 1474246
 
4.7%
Other values (41) 7521292
23.7%
Common
ValueCountFrequency (%)
2868231
50.0%
, 2867865
50.0%
? 39
 
< 0.1%
/ 30
 
< 0.1%
_ 4
 
< 0.1%
6 3
 
< 0.1%
0 2
 
< 0.1%
1 2
 
< 0.1%
3 2
 
< 0.1%
9 1
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 37439092
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 4571468
12.2%
e 2938748
 
7.8%
2868231
 
7.7%
, 2867865
 
7.7%
i 2865509
 
7.7%
o 2433279
 
6.5%
r 2317205
 
6.2%
t 2192393
 
5.9%
n 2160394
 
5.8%
p 1690401
 
4.5%
Other values (51) 10533599
28.1%

kingdom
Text

Constant  Missing 

Distinct 1
Distinct (%) < 0.1%
Missing 6300
Missing (%) 1.0%
Memory size 4.6 MiB

Length

Max length 8
Median length 8
Mean length 8
Min length 8

Characters and Unicode

Total characters 4787360
Distinct characters 6
Distinct categories 2
Distinct scripts 1
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 0
Unique (%) 0.0%

Sample

1st row Animalia
2nd row Animalia
3rd row Animalia
4th row Animalia
5th row Animalia

Value      Count    Frequency (%)
animalia   598420   100.0%

Most occurring characters

ValueCountFrequency (%)
i 1196840
25.0%
a 1196840
25.0%
A 598420
12.5%
n 598420
12.5%
m 598420
12.5%
l 598420
12.5%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 4188940
87.5%
Uppercase Letter 598420
 
12.5%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
i 1196840
28.6%
a 1196840
28.6%
n 598420
14.3%
m 598420
14.3%
l 598420
14.3%
Uppercase Letter
ValueCountFrequency (%)
A 598420
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 4787360
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
i 1196840
25.0%
a 1196840
25.0%
A 598420
12.5%
n 598420
12.5%
m 598420
12.5%
l 598420
12.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII 4787360
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
i 1196840
25.0%
a 1196840
25.0%
A 598420
12.5%
n 598420
12.5%
m 598420
12.5%
l 598420
12.5%

phylum
Text

Distinct 2
Distinct (%) < 0.1%
Missing 4930
Missing (%) 0.8%
Memory size 4.6 MiB

Length

Max length 10
Median length 10
Mean length 10
Min length 10

Characters and Unicode

Total characters 5997900
Distinct characters 8
Distinct categories 2
Distinct scripts 1
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 0
Unique (%) 0.0%

Sample

1st row Arthropoda
2nd row Arthropoda
3rd row Arthropoda
4th row Arthropoda
5th row Arthropoda

Value        Count    Frequency (%)
arthropoda   599790   100.0%

Most occurring characters

ValueCountFrequency (%)
r 1199580
20.0%
o 1199580
20.0%
a 599826
10.0%
t 599790
10.0%
h 599790
10.0%
p 599790
10.0%
d 599790
10.0%
A 599754
10.0%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 5398146
90.0%
Uppercase Letter 599754
 
10.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
r 1199580
22.2%
o 1199580
22.2%
a 599826
11.1%
t 599790
11.1%
h 599790
11.1%
p 599790
11.1%
d 599790
11.1%
Uppercase Letter
ValueCountFrequency (%)
A 599754
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 5997900
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
r 1199580
20.0%
o 1199580
20.0%
a 599826
10.0%
t 599790
10.0%
h 599790
10.0%
p 599790
10.0%
d 599790
10.0%
A 599754
10.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 5997900
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
r 1199580
20.0%
o 1199580
20.0%
a 599826
10.0%
t 599790
10.0%
h 599790
10.0%
p 599790
10.0%
d 599790
10.0%
A 599754
10.0%

class
Text

Distinct 13
Distinct (%) < 0.1%
Missing 5496
Missing (%) 0.9%
Memory size 4.6 MiB

Length

Max length 12
Median length 7
Mean length 7.038307878
Min length 7

Characters and Unicode

Total characters 4217523
Distinct characters 28
Distinct categories 2
Distinct scripts 1
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 1
Unique (%) < 0.1%

Sample

1st row Insecta
2nd row Insecta
3rd row Insecta
4th row Insecta
5th row Insecta

Value              Count    Frequency (%)
insecta            588007   98.1%
arachnida          7908     1.3%
diplopoda          1604     0.3%
collembola         798      0.1%
chilopoda          740      0.1%
diplura            76       < 0.1%
protura            62       < 0.1%
symphyla           8        < 0.1%
myriapoda          6        < 0.1%
onychophora        6        < 0.1%
Other values (3)   9        < 0.1%

Most occurring characters

ValueCountFrequency (%)
a 607141
14.4%
n 595933
14.1%
c 595923
14.1%
e 588805
14.0%
t 588070
13.9%
s 588008
13.9%
I 588007
13.9%
i 10334
 
0.2%
d 10262
 
0.2%
h 8668
 
0.2%
Other values (18) 36372
 
0.9%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 3618299
85.8%
Uppercase Letter 599224
 
14.2%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 607141
16.8%
n 595933
16.5%
c 595923
16.5%
e 588805
16.3%
t 588070
16.3%
s 588008
16.3%
i 10334
 
0.3%
d 10262
 
0.3%
h 8668
 
0.2%
r 8125
 
0.2%
Other values (9) 17030
 
0.5%
Uppercase Letter
ValueCountFrequency (%)
I 588007
98.1%
A 7908
 
1.3%
D 1680
 
0.3%
C 1538
 
0.3%
P 66
 
< 0.1%
S 8
 
< 0.1%
M 7
 
< 0.1%
O 6
 
< 0.1%
U 4
 
< 0.1%

Most occurring scripts

ValueCountFrequency (%)
Latin 4217523
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 607141
14.4%
n 595933
14.1%
c 595923
14.1%
e 588805
14.0%
t 588070
13.9%
s 588008
13.9%
I 588007
13.9%
i 10334
 
0.2%
d 10262
 
0.2%
h 8668
 
0.2%
Other values (18) 36372
 
0.9%

Most occurring blocks

ValueCountFrequency (%)
ASCII 4217523
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 607141
14.4%
n 595933
14.1%
c 595923
14.1%
e 588805
14.0%
t 588070
13.9%
s 588008
13.9%
I 588007
13.9%
i 10334
 
0.2%
d 10262
 
0.2%
h 8668
 
0.2%
Other values (18) 36372
 
0.9%

order
Text

Distinct 85
Distinct (%) < 0.1%
Missing 4816
Missing (%) 0.8%
Memory size 4.6 MiB

Length

Max length 17
Median length 16
Mean length 9.460972089
Min length 5

Characters and Unicode

Total characters 5675675
Distinct characters 43
Distinct categories 3
Distinct scripts 2
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 3
Unique (%) < 0.1%

Sample

1st row Hymenoptera
2nd row Lepidoptera
3rd row Lepidoptera
4th row Odonata
5th row Coleoptera

Value               Count    Frequency (%)
hymenoptera         146434   24.4%
odonata             117300   19.6%
lepidoptera         99929    16.7%
diptera             73541    12.3%
coleoptera          72075    12.0%
hemiptera           37773    6.3%
siphonaptera        10088    1.7%
trichoptera         9110     1.5%
araneae             4645     0.8%
thysanoptera        4630     0.8%
Other values (73)   24379    4.1%

Most occurring characters

ValueCountFrequency (%)
e 849639
15.0%
a 747964
13.2%
t 600150
10.6%
p 583317
10.3%
o 554705
9.8%
r 496497
8.7%
n 284722
 
5.0%
i 241800
 
4.3%
d 223179
 
3.9%
m 190651
 
3.4%
Other values (33) 903051
15.9%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 5075771
89.4%
Uppercase Letter 599902
 
10.6%
Other Punctuation 2
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 849639
16.7%
a 747964
14.7%
t 600150
11.8%
p 583317
11.5%
o 554705
10.9%
r 496497
9.8%
n 284722
 
5.6%
i 241800
 
4.8%
d 223179
 
4.4%
m 190651
 
3.8%
Other values (13) 303147
 
6.0%
Uppercase Letter
ValueCountFrequency (%)
H 184208
30.7%
O 119067
19.8%
L 100249
16.7%
D 73712
12.3%
C 72231
 
12.0%
T 13924
 
2.3%
S 10955
 
1.8%
P 8233
 
1.4%
A 5471
 
0.9%
M 4855
 
0.8%
Other values (9) 6997
 
1.2%
Other Punctuation
ValueCountFrequency (%)
? 2
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 5675673
> 99.9%
Common 2
 
< 0.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 849639
15.0%
a 747964
13.2%
t 600150
10.6%
p 583317
10.3%
o 554705
9.8%
r 496497
8.7%
n 284722
 
5.0%
i 241800
 
4.3%
d 223179
 
3.9%
m 190651
 
3.4%
Other values (32) 903049
15.9%
Common
ValueCountFrequency (%)
? 2
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 5675675
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 849639
15.0%
a 747964
13.2%
t 600150
10.6%
p 583317
10.3%
o 554705
9.8%
r 496497
8.7%
n 284722
 
5.0%
i 241800
 
4.3%
d 223179
 
3.9%
m 190651
 
3.4%
Other values (33) 903051
15.9%

family
Text

Distinct 1481
Distinct (%) 0.2%
Missing 4937
Missing (%) 0.8%
Memory size 4.6 MiB

Length

Max length 22
Median length 19
Mean length 10.51244367
Min length 3

Characters and Unicode

Total characters 6305185
Distinct characters 59
Distinct categories 6
Distinct scripts 2
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 207
Unique (%) < 0.1%

Sample

1st row Formicidae
2nd row Gelechiidae
3rd row Sesiidae
4th row Coenagrionidae
5th row Carabidae

Value                 Count    Frequency (%)
apidae                82945    13.8%
libellulidae          42510    7.1%
coenagrionidae        35189    5.9%
chrysomelidae         17542    2.9%
asilidae              13404    2.2%
geometridae           12783    2.1%
crambidae             12086    2.0%
curculionidae         12016    2.0%
psychodidae           11788    2.0%
formicidae            9927     1.7%
Other values (1470)   349958   58.3%

Most occurring characters

ValueCountFrequency (%)
i 913350
14.5%
e 888633
14.1%
a 818158
13.0%
d 670599
10.6%
o 326213
 
5.2%
l 321141
 
5.1%
r 288681
 
4.6%
p 212863
 
3.4%
n 209521
 
3.3%
h 149416
 
2.4%
Other values (49) 1506610
23.9%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 5705019
90.5%
Uppercase Letter 599782
 
9.5%
Space Separator 365
 
< 0.1%
Decimal Number 10
 
< 0.1%
Other Punctuation 5
 
< 0.1%
Connector Punctuation 4
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
i 913350
16.0%
e 888633
15.6%
a 818158
14.3%
d 670599
11.8%
o 326213
 
5.7%
l 321141
 
5.6%
r 288681
 
5.1%
p 212863
 
3.7%
n 209521
 
3.7%
h 149416
 
2.6%
Other values (16) 906444
15.9%
Uppercase Letter
ValueCountFrequency (%)
C 135683
22.6%
A 124194
20.7%
L 64733
10.8%
P 62029
10.3%
S 32302
 
5.4%
T 31841
 
5.3%
G 26661
 
4.4%
M 18029
 
3.0%
N 17189
 
2.9%
E 13602
 
2.3%
Other values (15) 73519
12.3%
Decimal Number
ValueCountFrequency (%)
6 3
30.0%
0 2
20.0%
1 2
20.0%
3 2
20.0%
9 1
 
10.0%
Space Separator
ValueCountFrequency (%)
365
100.0%
Other Punctuation
ValueCountFrequency (%)
? 5
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 4
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 6304801
> 99.9%
Common 384
 
< 0.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
i 913350
14.5%
e 888633
14.1%
a 818158
13.0%
d 670599
10.6%
o 326213
 
5.2%
l 321141
 
5.1%
r 288681
 
4.6%
p 212863
 
3.4%
n 209521
 
3.3%
h 149416
 
2.4%
Other values (41) 1506226
23.9%
Common
ValueCountFrequency (%)
365
95.1%
? 5
 
1.3%
_ 4
 
1.0%
6 3
 
0.8%
0 2
 
0.5%
1 2
 
0.5%
3 2
 
0.5%
9 1
 
0.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 6305185
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
i 913350
14.5%
e 888633
14.1%
a 818158
13.0%
d 670599
10.6%
o 326213
 
5.2%
l 321141
 
5.1%
r 288681
 
4.6%
p 212863
 
3.4%
n 209521
 
3.3%
h 149416
 
2.4%
Other values (49) 1506610
23.9%

genus
Text

Distinct 39740
Distinct (%) 6.6%
Missing 5432
Missing (%) 0.9%
Memory size 4.6 MiB

Length

Max length 32
Median length 21
Mean length 8.981117593
Min length 1

Characters and Unicode

Total characters 5382276
Distinct characters 72
Distinct categories 10
Distinct scripts 2
Distinct blocks 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
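
As a rough illustration of the character-property analysis described above, the sketch below counts characters by Unicode general category using Python's standard `unicodedata` module. The helper name `category_counts` and the label dictionary are assumptions made for this example; they are not taken from the profiling tool, whose own implementation may differ. Script and block lookups are not covered here because the standard library does not expose them.

```python
import unicodedata
from collections import Counter

# Standard Unicode general-category codes mapped to the labels used in this report.
CATEGORY_LABELS = {
    "Ll": "Lowercase Letter",
    "Lu": "Uppercase Letter",
    "Nd": "Decimal Number",
    "Zs": "Space Separator",
    "Po": "Other Punctuation",
    "Pc": "Connector Punctuation",
    "Pd": "Dash Punctuation",
    "Ps": "Open Punctuation",
    "Pe": "Close Punctuation",
    "Pf": "Final Punctuation",
    "Sm": "Math Symbol",
}


def category_counts(values):
    """Count characters per Unicode general category, skipping missing values."""
    counts = Counter()
    for value in values:
        if value is None:
            continue
        for ch in value:
            counts[unicodedata.category(ch)] += 1
    return counts


# The five sample rows shown for the genus variable.
sample = ["Camponotus", "Athrips", "Paranthrene", "Acanthagrion", "Calathus"]
counts = category_counts(sample)

for code, n in counts.most_common():
    print(CATEGORY_LABELS.get(code, code), n)
```

On the five sample rows this reports only lowercase and uppercase letters, in line with the two categories that dominate the full column.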

Unique

Unique: 14840
Unique (%): 2.5%
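
The summary figures for genus can be cross-checked with a little arithmetic. The sketch below assumes that Missing (%) is taken over all 604720 observations, while Distinct (%), Unique (%) and Mean length are taken over non-missing values only. This is inferred from the reported numbers, not stated by the report, and the variable names are illustrative.

```python
# Figures copied from the report: dataset overview and the genus block.
n_rows = 604720
missing = 5432
distinct = 39740
unique = 14840
total_chars = 5382276

# Assumption: derived statistics use non-missing values as the denominator.
present = n_rows - missing

missing_pct = 100 * missing / n_rows
distinct_pct = 100 * distinct / present
unique_pct = 100 * unique / present
mean_length = total_chars / present

print(f"Missing (%):  {missing_pct:.1f}")
print(f"Distinct (%): {distinct_pct:.1f}")
print(f"Unique (%):   {unique_pct:.1f}")
print(f"Mean length:  {mean_length:.9f}")
```

Under that assumption the computed values round to the reported 0.9%, 6.6%, 2.5% and a mean length of about 8.98.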

Sample

1st row: Camponotus
2nd row: Athrips
3rd row: Paranthrene
4th row: Acanthagrion
5th row: Calathus

Value                   Count    Frequency (%)
bombus                  62372    10.4%
xylocopa                12105    2.0%
unidentified            8808     1.5%
argia                   8662     1.4%
enallagma               7977     1.3%
crambus                 7970     1.3%
ischnura                7458     1.2%
sympetrum               6028     1.0%
apis                    4969     0.8%
lestes                  4236     0.7%
Other values (39686)    468802   78.2%
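
The counts in the table above sum to 599387, which equals the 599288 non-missing genus values plus the 99 space characters reported for this variable. That suggests the table counts lowercased, whitespace-separated tokens, not whole cell values. The sketch below shows one way to produce such a table under that assumption; `token_frequencies` is a hypothetical helper, and whether the profiling tool also strips punctuation from tokens is not stated in the report.

```python
from collections import Counter


def token_frequencies(values):
    """Count lowercased, whitespace-separated tokens across non-missing values."""
    counts = Counter()
    for value in values:
        if value is None:
            continue
        counts.update(value.lower().split())
    return counts


# Illustrative input only; these strings are not rows taken from the dataset.
freq = token_frequencies(["Bombus", "bombus", "Apis", None, "Bombus unidentified"])
total = sum(freq.values())

for token, n in freq.most_common():
    print(f"{token:<15}{n:>5}{100 * n / total:>8.1f}%")
```

A value containing one space contributes two tokens, which is why the token total can exceed the number of non-missing rows.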

Most occurring characters

ValueCountFrequency (%)
a 530794
 
9.9%
o 471993
 
8.8%
i 398632
 
7.4%
s 398294
 
7.4%
e 380922
 
7.1%
r 324165
 
6.0%
l 256048
 
4.8%
u 248374
 
4.6%
t 243058
 
4.5%
m 234489
 
4.4%
Other values (62) 1895507
35.2%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 4782652
88.9%
Uppercase Letter 599244
 
11.1%
Space Separator 99
 
< 0.1%
Open Punctuation 77
 
< 0.1%
Close Punctuation 77
 
< 0.1%
Other Punctuation 68
 
< 0.1%
Decimal Number 31
 
< 0.1%
Connector Punctuation 23
 
< 0.1%
Dash Punctuation 4
 
< 0.1%
Final Punctuation 1
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 530794
11.1%
o 471993
 
9.9%
i 398632
 
8.3%
s 398294
 
8.3%
e 380922
 
8.0%
r 324165
 
6.8%
l 256048
 
5.4%
u 248374
 
5.2%
t 243058
 
5.1%
m 234489
 
4.9%
Other values (17) 1295883
27.1%
Uppercase Letter
ValueCountFrequency (%)
B 76921
12.8%
P 69062
11.5%
A 66431
11.1%
C 64157
10.7%
E 40523
 
6.8%
S 37144
 
6.2%
L 31252
 
5.2%
T 27870
 
4.7%
H 27834
 
4.6%
M 26250
 
4.4%
Other values (16) 131800
22.0%
Decimal Number
ValueCountFrequency (%)
0 10
32.3%
1 7
22.6%
3 5
16.1%
2 3
 
9.7%
6 3
 
9.7%
4 2
 
6.5%
9 1
 
3.2%
Other Punctuation
ValueCountFrequency (%)
? 48
70.6%
. 16
 
23.5%
/ 3
 
4.4%
! 1
 
1.5%
Open Punctuation
ValueCountFrequency (%)
[ 55
71.4%
( 22
 
28.6%
Close Punctuation
ValueCountFrequency (%)
] 55
71.4%
) 22
 
28.6%
Space Separator
ValueCountFrequency (%)
99
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 23
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 4
100.0%
Final Punctuation
ValueCountFrequency (%)
1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 5381896
> 99.9%
Common 380
 
< 0.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 530794
 
9.9%
o 471993
 
8.8%
i 398632
 
7.4%
s 398294
 
7.4%
e 380922
 
7.1%
r 324165
 
6.0%
l 256048
 
4.8%
u 248374
 
4.6%
t 243058
 
4.5%
m 234489
 
4.4%
Other values (43) 1895127
35.2%
Common
ValueCountFrequency (%)
99
26.1%
[ 55
14.5%
] 55
14.5%
? 48
12.6%
_ 23
 
6.1%
( 22
 
5.8%
) 22
 
5.8%
. 16
 
4.2%
0 10
 
2.6%
1 7
 
1.8%
Other values (9) 23
 
6.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 5382258
> 99.9%
None 17
 
< 0.1%
Punctuation 1
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 530794
 
9.9%
o 471993
 
8.8%
i 398632
 
7.4%
s 398294
 
7.4%
e 380922
 
7.1%
r 324165
 
6.0%
l 256048
 
4.8%
u 248374
 
4.6%
t 243058
 
4.5%
m 234489
 
4.4%
Other values (60) 1895489
35.2%
None
ValueCountFrequency (%)
ö 17
100.0%
Punctuation
ValueCountFrequency (%)
1
100.0%

subgenus
Text

Missing 

Distinct: 3170
Distinct (%): 3.4%
Missing: 512525
Missing (%): 84.8%
Memory size: 4.6 MiB

Length

Max length: 21
Median length: 18
Mean length: 9.945918976
Min length: 1

Characters and Unicode

Total characters: 916964
Distinct characters: 57
Distinct categories: 5
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1134
Unique (%): 1.2%

Sample

1st row: Myrmosericus
2nd row: Anomalagrion
3rd row: Anomalagrion
4th row: Hypocaccus
5th row: Bombus

Value                  Count   Frequency (%)
pyrobombus             21248   23.0%
bombus                 7225    7.8%
apis                   3633    3.9%
fervidobombus          3293    3.6%
neoxylocopa            2426    2.6%
alpinobombus           1554    1.7%
xylocopoides           1492    1.6%
schonnherria           1460    1.6%
separatobombus         1325    1.4%
chimarra               1296    1.4%
Other values (3159)    47264   51.3%

Most occurring characters

ValueCountFrequency (%)
o 129387
14.1%
s 73941
 
8.1%
b 73482
 
8.0%
r 63107
 
6.9%
m 58165
 
6.3%
u 57827
 
6.3%
a 57453
 
6.3%
i 52141
 
5.7%
y 39777
 
4.3%
e 39462
 
4.3%
Other values (47) 272222
29.7%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 824673
89.9%
Uppercase Letter 92195
 
10.1%
Other Punctuation 74
 
< 0.1%
Space Separator 21
 
< 0.1%
Dash Punctuation 1
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
o 129387
15.7%
s 73941
9.0%
b 73482
8.9%
r 63107
 
7.7%
m 58165
 
7.1%
u 57827
 
7.0%
a 57453
 
7.0%
i 52141
 
6.3%
y 39777
 
4.8%
e 39462
 
4.8%
Other values (17) 179931
21.8%
Uppercase Letter
ValueCountFrequency (%)
P 28486
30.9%
A 9332
 
10.1%
B 8660
 
9.4%
S 6505
 
7.1%
M 5495
 
6.0%
C 5417
 
5.9%
N 4425
 
4.8%
F 4007
 
4.3%
T 3288
 
3.6%
D 2690
 
2.9%
Other values (16) 13890
15.1%
Other Punctuation
ValueCountFrequency (%)
. 70
94.6%
? 4
 
5.4%
Space Separator
ValueCountFrequency (%)
21
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 916868
> 99.9%
Common 96
 
< 0.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
o 129387
14.1%
s 73941
 
8.1%
b 73482
 
8.0%
r 63107
 
6.9%
m 58165
 
6.3%
u 57827
 
6.3%
a 57453
 
6.3%
i 52141
 
5.7%
y 39777
 
4.3%
e 39462
 
4.3%
Other values (43) 272126
29.7%
Common
ValueCountFrequency (%)
. 70
72.9%
21
 
21.9%
? 4
 
4.2%
- 1
 
1.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 916963
> 99.9%
None 1
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
o 129387
14.1%
s 73941
 
8.1%
b 73482
 
8.0%
r 63107
 
6.9%
m 58165
 
6.3%
u 57827
 
6.3%
a 57453
 
6.3%
i 52141
 
5.7%
y 39777
 
4.3%
e 39462
 
4.3%
Other values (46) 272221
29.7%
None
ValueCountFrequency (%)
ö 1
100.0%

specificEpithet
Text

Missing 

Distinct: 88940
Distinct (%): 14.9%
Missing: 8751
Missing (%): 1.4%
Memory size: 4.6 MiB

Length

Max length: 32
Median length: 25
Mean length: 8.294070665
Min length: 1

Characters and Unicode

Total characters: 4943009
Distinct characters: 53
Distinct categories: 9
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 50119
Unique (%): 8.4%

Sample

1st row: rufoglaucus
2nd row: mesoleuca
3rd row: asilipennis
4th row: trilobatum
5th row: nanulus

Value                   Count    Frequency (%)
sp                      44400    7.4%
sylvicola               6285     1.1%
bifarius                4078     0.7%
kirbyellus              3621     0.6%
flavifrons              3483     0.6%
impatiens               3134     0.5%
undetermined            3047     0.5%
nevadensis              2529     0.4%
cerana                  2431     0.4%
affinis                 2295     0.4%
Other values (88797)    521298   87.4%

Most occurring characters

ValueCountFrequency (%)
a 624165
12.6%
i 556726
11.3%
s 471813
 
9.5%
e 377708
 
7.6%
n 324342
 
6.6%
l 324007
 
6.6%
r 301739
 
6.1%
u 289537
 
5.9%
t 260984
 
5.3%
c 231687
 
4.7%
Other values (43) 1180301
23.9%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 4896540
99.1%
Other Punctuation 44585
 
0.9%
Decimal Number 697
 
< 0.1%
Space Separator 632
 
< 0.1%
Connector Punctuation 289
 
< 0.1%
Dash Punctuation 249
 
< 0.1%
Open Punctuation 8
 
< 0.1%
Close Punctuation 8
 
< 0.1%
Math Symbol 1
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 624165
12.7%
i 556726
11.4%
s 471813
9.6%
e 377708
 
7.7%
n 324342
 
6.6%
l 324007
 
6.6%
r 301739
 
6.2%
u 289537
 
5.9%
t 260984
 
5.3%
c 231687
 
4.7%
Other values (18) 1133832
23.2%
Decimal Number
ValueCountFrequency (%)
1 205
29.4%
9 106
15.2%
0 83
11.9%
2 72
 
10.3%
3 59
 
8.5%
4 53
 
7.6%
6 41
 
5.9%
7 30
 
4.3%
5 30
 
4.3%
8 18
 
2.6%
Other Punctuation
ValueCountFrequency (%)
. 44494
99.8%
? 43
 
0.1%
# 34
 
0.1%
/ 9
 
< 0.1%
' 2
 
< 0.1%
; 2
 
< 0.1%
, 1
 
< 0.1%
Open Punctuation
ValueCountFrequency (%)
( 5
62.5%
[ 3
37.5%
Close Punctuation
ValueCountFrequency (%)
) 5
62.5%
] 3
37.5%
Space Separator
ValueCountFrequency (%)
632
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 289
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 249
100.0%
Math Symbol
ValueCountFrequency (%)
= 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 4896540
99.1%
Common 46469
 
0.9%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 624165
12.7%
i 556726
11.4%
s 471813
9.6%
e 377708
 
7.7%
n 324342
 
6.6%
l 324007
 
6.6%
r 301739
 
6.2%
u 289537
 
5.9%
t 260984
 
5.3%
c 231687
 
4.7%
Other values (18) 1133832
23.2%
Common
ValueCountFrequency (%)
. 44494
95.7%
632
 
1.4%
_ 289
 
0.6%
- 249
 
0.5%
1 205
 
0.4%
9 106
 
0.2%
0 83
 
0.2%
2 72
 
0.2%
3 59
 
0.1%
4 53
 
0.1%
Other values (15) 227
 
0.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII 4943006
> 99.9%
None 3
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 624165
12.6%
i 556726
11.3%
s 471813
 
9.5%
e 377708
 
7.6%
n 324342
 
6.6%
l 324007
 
6.6%
r 301739
 
6.1%
u 289537
 
5.9%
t 260984
 
5.3%
c 231687
 
4.7%
Other values (41) 1180298
23.9%
None
ValueCountFrequency (%)
ñ 2
66.7%
ö 1
33.3%

infraspecificEpithet
Text

Missing 

Distinct: 8352
Distinct (%): 24.9%
Missing: 571231
Missing (%): 94.5%
Memory size: 4.6 MiB

Length

Max length: 36
Median length: 22
Mean length: 8.8483084
Min length: 1

Characters and Unicode

Total characters: 296321
Distinct characters: 38
Distinct categories: 8
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 5802
Unique (%): 17.3%

Sample

1st row: rufigenis
2nd row: decrescens
3rd row: marianae
4th row: neglectum
5th row: lavatus

Value                  Count   Frequency (%)
nearcticus             2527    7.5%
fervidus               1188    3.5%
violacea               992     3.0%
pensylvanicus          904     2.7%
vagans                 870     2.6%
portia                 724     2.2%
virginica              593     1.8%
auricormus             587     1.8%
auripennis             578     1.7%
dorsata                440     1.3%
Other values (8332)    24136   72.0%

Most occurring characters

ValueCountFrequency (%)
a 38963
13.1%
i 34627
11.7%
s 26724
9.0%
n 23021
 
7.8%
r 21553
 
7.3%
e 21399
 
7.2%
u 19000
 
6.4%
c 18314
 
6.2%
t 14569
 
4.9%
o 13863
 
4.7%
Other values (28) 64288
21.7%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 296210
> 99.9%
Space Separator 50
 
< 0.1%
Other Punctuation 39
 
< 0.1%
Uppercase Letter 6
 
< 0.1%
Dash Punctuation 5
 
< 0.1%
Open Punctuation 4
 
< 0.1%
Close Punctuation 4
 
< 0.1%
Decimal Number 3
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 38963
13.2%
i 34627
11.7%
s 26724
9.0%
n 23021
 
7.8%
r 21553
 
7.3%
e 21399
 
7.2%
u 19000
 
6.4%
c 18314
 
6.2%
t 14569
 
4.9%
o 13863
 
4.7%
Other values (16) 64177
21.7%
Other Punctuation
ValueCountFrequency (%)
. 23
59.0%
? 14
35.9%
/ 2
 
5.1%
Decimal Number
ValueCountFrequency (%)
1 1
33.3%
2 1
33.3%
6 1
33.3%
Uppercase Letter
ValueCountFrequency (%)
V 5
83.3%
C 1
 
16.7%
Space Separator
ValueCountFrequency (%)
50
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 5
100.0%
Open Punctuation
ValueCountFrequency (%)
( 4
100.0%
Close Punctuation
ValueCountFrequency (%)
) 4
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 296216
> 99.9%
Common 105
 
< 0.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 38963
13.2%
i 34627
11.7%
s 26724
9.0%
n 23021
 
7.8%
r 21553
 
7.3%
e 21399
 
7.2%
u 19000
 
6.4%
c 18314
 
6.2%
t 14569
 
4.9%
o 13863
 
4.7%
Other values (18) 64183
21.7%
Common
ValueCountFrequency (%)
50
47.6%
. 23
21.9%
? 14
 
13.3%
- 5
 
4.8%
( 4
 
3.8%
) 4
 
3.8%
/ 2
 
1.9%
1 1
 
1.0%
2 1
 
1.0%
6 1
 
1.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 296321
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 38963
13.1%
i 34627
11.7%
s 26724
9.0%
n 23021
 
7.8%
r 21553
 
7.3%
e 21399
 
7.2%
u 19000
 
6.4%
c 18314
 
6.2%
t 14569
 
4.9%
o 13863
 
4.7%
Other values (28) 64288
21.7%

taxonRank
Text

Missing 

Distinct: 17
Distinct (%): 0.1%
Missing: 571236
Missing (%): 94.5%
Memory size: 4.6 MiB

Length

Max length: 17
Median length: 10
Mean length: 9.835861904
Min length: 4

Characters and Unicode

Total characters: 329344
Distinct characters: 30
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1
Unique (%): < 0.1%

Sample

1st row: Variety
2nd row: subspecies
3rd row: subspecies
4th row: subspecies
5th row: subspecies

Value               Count   Frequency (%)
subspecies          31600   94.3%
variety             1483    4.4%
aberration          168     0.5%
form                104     0.3%
race                69      0.2%
morphotype          28      0.1%
species             10      < 0.1%
group               10      < 0.1%
undet.cat           9       < 0.1%
var                 5       < 0.1%
Other values (4)    10      < 0.1%

Most occurring characters

ValueCountFrequency (%)
s 94823
28.8%
e 64977
19.7%
i 33273
 
10.1%
b 31768
 
9.6%
p 31681
 
9.6%
c 31679
 
9.6%
u 31610
 
9.6%
r 1986
 
0.6%
a 1745
 
0.5%
t 1706
 
0.5%
Other values (20) 4096
 
1.2%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 327473
99.4%
Uppercase Letter 1836
 
0.6%
Other Punctuation 23
 
< 0.1%
Space Separator 12
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
s 94823
29.0%
e 64977
19.8%
i 33273
 
10.2%
b 31768
 
9.7%
p 31681
 
9.7%
c 31679
 
9.7%
u 31610
 
9.7%
r 1986
 
0.6%
a 1745
 
0.5%
t 1706
 
0.5%
Other values (10) 2225
 
0.7%
Uppercase Letter
ValueCountFrequency (%)
V 1470
80.1%
A 165
 
9.0%
F 92
 
5.0%
R 58
 
3.2%
M 28
 
1.5%
U 9
 
0.5%
C 9
 
0.5%
S 5
 
0.3%
Other Punctuation
ValueCountFrequency (%)
. 23
100.0%
Space Separator
ValueCountFrequency (%)
12
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 329309
> 99.9%
Common 35
 
< 0.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
s 94823
28.8%
e 64977
19.7%
i 33273
 
10.1%
b 31768
 
9.6%
p 31681
 
9.6%
c 31679
 
9.6%
u 31610
 
9.6%
r 1986
 
0.6%
a 1745
 
0.5%
t 1706
 
0.5%
Other values (18) 4061
 
1.2%
Common
ValueCountFrequency (%)
. 23
65.7%
12
34.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 329344
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
s 94823
28.8%
e 64977
19.7%
i 33273
 
10.1%
b 31768
 
9.6%
p 31681
 
9.6%
c 31679
 
9.6%
u 31610
 
9.6%
r 1986
 
0.6%
a 1745
 
0.5%
t 1706
 
0.5%
Other values (20) 4096
 
1.2%
Distinct: 10001
Distinct (%): 1.9%
Missing: 90502
Missing (%): 15.0%
Memory size: 4.6 MiB

Length

Max length: 43
Median length: 33
Mean length: 7.761809194
Min length: 2

Characters and Unicode

Total characters: 3991262
Distinct characters: 83
Distinct categories: 7
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 3229
Unique (%): 0.6%

Sample

1st row: Forel
2nd row: (Lower)
3rd row: (Guérin-Méneville)
4th row: Leonard
5th row: Casey

Value                  Count    Frequency (%)
(blank)                25801    4.4%
hagen                  24579    4.1%
cresson                22178    3.7%
selys                  21328    3.6%
casey                  19749    3.3%
say                    14238    2.4%
fabricius              13983    2.4%
alexander              9897     1.7%
smith                  9578     1.6%
kirby                  8910     1.5%
Other values (6005)    422402   71.3%

Most occurring characters

ValueCountFrequency (%)
e 427913
 
10.7%
a 306441
 
7.7%
r 298135
 
7.5%
n 241900
 
6.1%
s 234839
 
5.9%
i 207245
 
5.2%
l 195525
 
4.9%
o 172511
 
4.3%
( 140296
 
3.5%
) 140296
 
3.5%
Other values (73) 1626161
40.7%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 3036622
76.1%
Uppercase Letter 564533
 
14.1%
Open Punctuation 140297
 
3.5%
Close Punctuation 140297
 
3.5%
Space Separator 78425
 
2.0%
Other Punctuation 27987
 
0.7%
Dash Punctuation 3101
 
0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 427913
14.1%
a 306441
10.1%
r 298135
9.8%
n 241900
 
8.0%
s 234839
 
7.7%
i 207245
 
6.8%
l 195525
 
6.4%
o 172511
 
5.7%
t 118977
 
3.9%
u 111745
 
3.7%
Other values (36) 721391
23.8%
Uppercase Letter
ValueCountFrequency (%)
C 79152
14.0%
S 78874
14.0%
H 48773
 
8.6%
B 48507
 
8.6%
M 33646
 
6.0%
D 32425
 
5.7%
F 31311
 
5.5%
W 28701
 
5.1%
L 28670
 
5.1%
R 25232
 
4.5%
Other values (17) 129242
22.9%
Other Punctuation
ValueCountFrequency (%)
& 25801
92.2%
. 1801
 
6.4%
' 383
 
1.4%
, 2
 
< 0.1%
Open Punctuation
ValueCountFrequency (%)
( 140296
> 99.9%
[ 1
 
< 0.1%
Close Punctuation
ValueCountFrequency (%)
) 140296
> 99.9%
] 1
 
< 0.1%
Space Separator
ValueCountFrequency (%)
78425
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 3101
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 3601155
90.2%
Common 390107
 
9.8%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 427913
 
11.9%
a 306441
 
8.5%
r 298135
 
8.3%
n 241900
 
6.7%
s 234839
 
6.5%
i 207245
 
5.8%
l 195525
 
5.4%
o 172511
 
4.8%
t 118977
 
3.3%
u 111745
 
3.1%
Other values (63) 1285924
35.7%
Common
ValueCountFrequency (%)
( 140296
36.0%
) 140296
36.0%
78425
20.1%
& 25801
 
6.6%
- 3101
 
0.8%
. 1801
 
0.5%
' 383
 
0.1%
, 2
 
< 0.1%
[ 1
 
< 0.1%
] 1
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 3984593
99.8%
None 6669
 
0.2%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 427913
 
10.7%
a 306441
 
7.7%
r 298135
 
7.5%
n 241900
 
6.1%
s 234839
 
5.9%
i 207245
 
5.2%
l 195525
 
4.9%
o 172511
 
4.3%
( 140296
 
3.5%
) 140296
 
3.5%
Other values (52) 1619492
40.6%
None
ValueCountFrequency (%)
é 2819
42.3%
ü 1605
24.1%
ö 1059
 
15.9%
á 557
 
8.4%
ä 442
 
6.6%
ã 32
 
0.5%
ý 22
 
0.3%
ó 21
 
0.3%
ç 19
 
0.3%
è 17
 
0.3%
Other values (11) 76
 
1.1%

vernacularName
Text

Constant  Missing 

Distinct: 1
Distinct (%): 50.0%
Missing: 604718
Missing (%): > 99.9%
Memory size: 4.6 MiB

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Characters and Unicode

Total characters: 8
Distinct characters: 4
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Type
2nd row: Type

Value   Count   Frequency (%)
type    2       100.0%

Most occurring characters

ValueCountFrequency (%)
T 2
25.0%
y 2
25.0%
p 2
25.0%
e 2
25.0%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 6
75.0%
Uppercase Letter 2
 
25.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
y 2
33.3%
p 2
33.3%
e 2
33.3%
Uppercase Letter
ValueCountFrequency (%)
T 2
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 8
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
T 2
25.0%
y 2
25.0%
p 2
25.0%
e 2
25.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 8
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
T 2
25.0%
y 2
25.0%
p 2
25.0%
e 2
25.0%